Class MLFeatureGenerator
java.lang.Object
es.uam.eps.ir.relison.examples.links.recommendation.MLFeatureGenerator
public class MLFeatureGenerator extends java.lang.Object
Class for generating learning to rank / machine learning examples.
The examples are written in the LETOR format.
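As a rough illustration of the LETOR format mentioned above, each example occupies one line of the form `<relevance> qid:<query id> <feature id>:<value> ... # <comment>`, with feature identifiers starting at 1. The sketch below only shows that line layout. The class name `LetorLineSketch`, the method `toLetorLine`, the four-decimal formatting, and the reading of the query as the target user are assumptions made for illustration; they are not taken from this class.

```java
import java.util.Locale;

public class LetorLineSketch {
    /**
     * Formats one example as a LETOR line (hypothetical helper, not part of MLFeatureGenerator):
     * <relevance> qid:<queryId> 1:<v1> 2:<v2> ... # <comment>
     */
    public static String toLetorLine(int relevance, long queryId, double[] features, String comment) {
        StringBuilder sb = new StringBuilder();
        sb.append(relevance).append(" qid:").append(queryId);
        for (int i = 0; i < features.length; i++) {
            // Feature identifiers are 1-based in LETOR files.
            sb.append(' ').append(i + 1).append(':')
              .append(String.format(Locale.ROOT, "%.4f", features[i]));
        }
        // The trailing comment is optional.
        if (comment != null && !comment.isEmpty()) {
            sb.append(" # ").append(comment);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed reading: query 42 is a target user, the comment names the candidate user.
        System.out.println(toLetorLine(1, 42L, new double[]{0.5, 0.0, 3.25}, "candidate 7"));
        // prints "1 qid:42 1:0.5000 2:0.0000 3:3.2500 # candidate 7"
    }
}
```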
-
Constructor Summary
Constructors
Constructor | Description
MLFeatureGenerator() |
-
Method Summary
Modifier and Type | Method | Description
private static FeatureInformation | computeInstances(java.lang.String train, java.lang.String test, boolean directed, boolean weightedSampling, boolean weightedClasses, boolean weightedFeatures, java.lang.String sampling, java.lang.String output, java.util.List<java.lang.String> descriptions, java.util.List<FeatureType> types, java.util.List<RecommendationAlgorithmFunction<java.lang.Long>> similarities, java.util.List<VertexMetricFunction<java.lang.Long>> vertexmetrics, java.util.List<PairMetricFunction<java.lang.Long>> pairmetrics, java.lang.String normalization) | Computes the instances for a pair of graphs.
static void | main(java.lang.String[] args) | Builds a set of learning to rank instances using similarities between pairs of users.
private static es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long,java.lang.Long> | normalize(es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long,java.lang.Long> recomm, java.lang.String normalization) | Normalizes a recommendation.
private static <L> java.util.Map<L,java.lang.Double> | normalize(java.util.Map<L,java.lang.Double> recomm, java.lang.String normalization) | Normalizes a map of values.
-
Constructor Details
-
MLFeatureGenerator
public MLFeatureGenerator()
-
Method Details
-
main
public static void main(java.lang.String[] args) throws java.io.IOException
Builds a set of learning to rank instances using similarities between pairs of users.
- Parameters:
args
- Execution arguments:
- Train-instance graph: Graph for obtaining the features of the training set
- Train-class graph: Graph for obtaining the relevance of each training instance
- Test-instance graph: Graph for obtaining the features of the test set
- Test-class graph: Graph for obtaining the relevance of each test set example
- Directed: true if the graph is directed, false otherwise
- Weighted sampling: true if we want to use graph weights for the sampling procedures.
- Weighted classes: true if we take weights as classes (otherwise, binary classes).
- Weighted features: true if we have to use weights for computing the features.
- Train sampling: configuration for the individual sampler used in training.
- Test sampling: configuration for the individual sampler used in test.
- Configuration: YAML file containing the configurations we want to use.
- Train output: File to store the training examples
- Test output: File to store the test examples
- Normalization: Score normalization scheme, one of:
  - none: No normalization
  - ranksim: Ranking normalization
  - minmax: Rescale the scores to the interval [0,1]
  - z-score: Rescale the scores of each query to have zero mean and unit variance
- Throws:
java.io.IOException
- if something fails while reading/writing
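The minmax and z-score options listed under Normalization can be sketched as follows. This is a minimal illustration of the two rescalings applied to a map of scores, not the implementation used by this class. The class name `NormalizationSketch`, the method names, the use of the population standard deviation, and the choice to map constant inputs to 0 are all assumptions.

```java
import java.util.HashMap;
import java.util.Map;

public class NormalizationSketch {
    /** Min-max: rescales the values to [0,1]. Constant inputs map to 0 (assumed convention). */
    public static <K> Map<K, Double> minMax(Map<K, Double> scores) {
        double min = Double.POSITIVE_INFINITY;
        double max = Double.NEGATIVE_INFINITY;
        for (double v : scores.values()) {
            min = Math.min(min, v);
            max = Math.max(max, v);
        }
        double range = max - min;
        Map<K, Double> out = new HashMap<>();
        for (Map.Entry<K, Double> e : scores.entrySet()) {
            out.put(e.getKey(), range > 0.0 ? (e.getValue() - min) / range : 0.0);
        }
        return out;
    }

    /** Z-score: shifts to zero mean and scales to unit (population) variance. */
    public static <K> Map<K, Double> zScore(Map<K, Double> scores) {
        int n = scores.size();
        double sum = 0.0;
        for (double v : scores.values()) {
            sum += v;
        }
        double mean = n > 0 ? sum / n : 0.0;
        double squares = 0.0;
        for (double v : scores.values()) {
            squares += (v - mean) * (v - mean);
        }
        double std = n > 0 ? Math.sqrt(squares / n) : 0.0;
        Map<K, Double> out = new HashMap<>();
        for (Map.Entry<K, Double> e : scores.entrySet()) {
            // With zero deviation every value equals the mean, so it maps to 0.
            out.put(e.getKey(), std > 0.0 ? (e.getValue() - mean) / std : 0.0);
        }
        return out;
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new HashMap<>();
        scores.put("a", 2.0);
        scores.put("b", 4.0);
        scores.put("c", 6.0);
        System.out.println(minMax(scores));
        System.out.println(zScore(scores));
    }
}
```

For the scores 2, 4 and 6, min-max yields 0, 0.5 and 1, while the z-score of the middle value is 0 and the other two are symmetric around it.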
-
computeInstances
private static FeatureInformation computeInstances(java.lang.String train, java.lang.String test, boolean directed, boolean weightedSampling, boolean weightedClasses, boolean weightedFeatures, java.lang.String sampling, java.lang.String output, java.util.List<java.lang.String> descriptions, java.util.List<FeatureType> types, java.util.List<RecommendationAlgorithmFunction<java.lang.Long>> similarities, java.util.List<VertexMetricFunction<java.lang.Long>> vertexmetrics, java.util.List<PairMetricFunction<java.lang.Long>> pairmetrics, java.lang.String normalization) throws java.io.IOException
Computes the instances for a pair of graphs.
- Parameters:
train
- training graph.
test
- validation/test graph.
directed
- true if the graph is directed, false otherwise.
weightedSampling
- true if we use edge weights for the sampling procedure, false if we take binary weights.
weightedClasses
- true if we use edge weights as classes, false if we use binary classes.
weightedFeatures
- true if we use edge weights to compute the features, false otherwise.
sampling
- sampling algorithm grid.
output
- file in which to output the examples.
descriptions
- list of features.
types
- list of types.
similarities
- list of recommendation algorithms to compute scores as features.
vertexmetrics
- list of vertex metrics to compute the features.
pairmetrics
- list of pair metrics to compute the features.
normalization
- the normalization scheme to use.
- Returns:
- the information about the features.
- Throws:
java.io.IOException
- if something failed while creating the instances.
-
normalize
private static es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long,java.lang.Long> normalize(es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long,java.lang.Long> recomm, java.lang.String normalization)
Normalizes a recommendation.
- Parameters:
recomm
- the recommendation.
normalization
- the identifier of the normalization algorithm.
- Returns:
- the normalized recommendation.
-
normalize
private static <L> java.util.Map<L,java.lang.Double> normalize(java.util.Map<L,java.lang.Double> recomm, java.lang.String normalization)
Normalizes a map of values.
- Type Parameters:
L
- the type of the keys.
- Parameters:
recomm
- the map.
normalization
- the identifier of the normalization algorithm.
- Returns:
- the normalized map.
-