Class MLFeatureGenerator
java.lang.Object
es.uam.eps.ir.relison.examples.links.recommendation.MLFeatureGenerator
public class MLFeatureGenerator
extends java.lang.Object
Generates learning-to-rank / machine learning examples.
The examples are written in the LETOR format.
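For orientation, the following is a minimal sketch of what one line in the widely used LETOR text layout looks like: a relevance label, a query id, 1-based `index:value` feature pairs, and an optional trailing comment. The class `LetorLineSketch`, its `format` method, the four-decimal precision, and the `candidate=7` comment are all hypothetical and are not part of RELISON; the exact output of MLFeatureGenerator may differ.

```java
import java.util.Locale;

/** Hypothetical helper (not part of RELISON): formats one example as a LETOR-style line. */
final class LetorLineSketch {

    /**
     * Builds "<relevance> qid:<queryId> 1:<f1> 2:<f2> ... # <comment>".
     * The four-decimal precision is an arbitrary choice made for this sketch.
     */
    static String format(int relevance, long queryId, double[] features, String comment) {
        StringBuilder sb = new StringBuilder();
        sb.append(relevance).append(" qid:").append(queryId);
        for (int i = 0; i < features.length; i++) {
            // Feature indices in LETOR files start at 1.
            sb.append(' ').append(i + 1).append(':')
              .append(String.format(Locale.ROOT, "%.4f", features[i]));
        }
        if (comment != null && !comment.isEmpty()) {
            sb.append(" # ").append(comment);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Prints: 1 qid:42 1:0.5000 2:0.2500 # candidate=7
        System.out.println(format(1, 42L, new double[]{0.5, 0.25}, "candidate=7"));
    }
}
```

In a link recommendation setting, one plausible reading (an assumption here) is that the query id identifies the target user and each line describes one candidate user.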
Constructor Summary
Constructors
Constructor              Description
MLFeatureGenerator()     -
Method Summary
Modifier and Type: private static FeatureInformation
Method: computeInstances(java.lang.String train, java.lang.String test, boolean directed, boolean weightedSampling, boolean weightedClasses, boolean weightedFeatures, java.lang.String sampling, java.lang.String output, java.util.List<java.lang.String> descriptions, java.util.List<FeatureType> types, java.util.List<RecommendationAlgorithmFunction<java.lang.Long>> similarities, java.util.List<VertexMetricFunction<java.lang.Long>> vertexmetrics, java.util.List<PairMetricFunction<java.lang.Long>> pairmetrics, java.lang.String normalization)
Description: Computes the instances for a pair of graphs.

Modifier and Type: static void
Method: main(java.lang.String[] args)
Description: Builds a set of learning to rank instances using similarities between pairs of users.

Modifier and Type: private static es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long,java.lang.Long>
Method: normalize(es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long,java.lang.Long> recomm, java.lang.String normalization)
Description: Normalizes a recommendation.

Modifier and Type: private static <L> java.util.Map<L,java.lang.Double>
Method: normalize(java.util.Map<L,java.lang.Double> recomm, java.lang.String normalization)
Description: Normalizes a map of values.
Constructor Details
MLFeatureGenerator
public MLFeatureGenerator()
Method Details
main
public static void main(java.lang.String[] args) throws java.io.IOException
Builds a set of learning to rank instances using similarities between pairs of users.
Parameters:
args - Execution arguments:
- Train-instance graph: Graph for obtaining the features of the training set
- Train-class graph: Graph for obtaining the relevance of each training instance
- Test-instance graph: Graph for obtaining the features of the test set
- Test-class graph: Graph for obtaining the relevance of each test set example
- Directed: true if the graph is directed, false otherwise
- Weighted sampling: true if we want to use graph weights for the sampling procedures.
- Weighted classes: true if we take weights as classes (otherwise, binary classes).
- Weighted features: true if we have to use weights for computing the features.
- Train sampling: configuration for the individual sampler used in training.
- Test sampling: configuration for the individual sampler used in test.
- Configuration: YAML file containing the configurations we want to use.
- Train output: File to store the training examples
- Test output: File to store the test examples
- Normalization: Score normalization:
- none: No normalization
- ranksim: Ranking normalization
- minmax: Rescale the scores to interval [0,1]
- z-score: Standardize the scores to have zero mean and unit variance
Throws:
java.io.IOException - if something fails while reading or writing
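The execution arguments above are positional. As a rough sketch, assuming they are passed in exactly the order listed (the page does not state this explicitly), a caller could label them as follows before launching the program. The class `GeneratorArgsSketch`, its key names, and the file names in the example are hypothetical and are not part of RELISON.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical helper (not part of RELISON): labels the positional arguments of main. */
final class GeneratorArgsSketch {

    // Names follow the order in which the arguments are documented (assumed to be positional).
    static final String[] NAMES = {
        "trainInstanceGraph", "trainClassGraph", "testInstanceGraph", "testClassGraph",
        "directed", "weightedSampling", "weightedClasses", "weightedFeatures",
        "trainSampling", "testSampling", "configuration",
        "trainOutput", "testOutput", "normalization"
    };

    /** Maps each documented argument name to the value found at its position. */
    static Map<String, String> label(String[] args) {
        if (args.length < NAMES.length) {
            throw new IllegalArgumentException(
                "Expected " + NAMES.length + " arguments, got " + args.length);
        }
        Map<String, String> labelled = new LinkedHashMap<>();
        for (int i = 0; i < NAMES.length; i++) {
            labelled.put(NAMES[i], args[i]);
        }
        return labelled;
    }

    public static void main(String[] args) {
        // All file names below are placeholders chosen for this sketch.
        String[] example = {
            "train-instances.txt", "train-classes.txt", "test-instances.txt", "test-classes.txt",
            "true", "false", "false", "true",
            "train-sampling", "test-sampling", "config.yml",
            "train-output.letor", "test-output.letor", "minmax"
        };
        label(example).forEach((name, value) -> System.out.println(name + " = " + value));
    }
}
```

Labelling the values up front makes it easier to spot a swapped boolean flag or a misplaced path before a long feature computation starts.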
computeInstances
private static FeatureInformation computeInstances(java.lang.String train, java.lang.String test, boolean directed, boolean weightedSampling, boolean weightedClasses, boolean weightedFeatures, java.lang.String sampling, java.lang.String output, java.util.List<java.lang.String> descriptions, java.util.List<FeatureType> types, java.util.List<RecommendationAlgorithmFunction<java.lang.Long>> similarities, java.util.List<VertexMetricFunction<java.lang.Long>> vertexmetrics, java.util.List<PairMetricFunction<java.lang.Long>> pairmetrics, java.lang.String normalization) throws java.io.IOException
Computes the instances for a pair of graphs.
Parameters:
train - training graph.
test - validation/test graph.
directed - true if the graph is directed, false otherwise.
weightedSampling - true if we use edge weights for the sampling procedure, false if we take binary weights.
weightedClasses - true if we use edge weights as classes, false if we use binary classes.
weightedFeatures - true if we use edge weights to compute the features, false otherwise.
sampling - sampling algorithm grid.
output - file in which to output the examples.
descriptions - list of features.
types - list of types.
similarities - list of recommendation algorithms to compute scores as features.
vertexmetrics - list of vertex metrics to compute the features.
pairmetrics - list of pair metrics to compute the features.
normalization - the normalization scheme to use.
Returns:
the information about the features.
Throws:
java.io.IOException - if something failed while creating the instances.
normalize
private static es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long,java.lang.Long> normalize(es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long,java.lang.Long> recomm, java.lang.String normalization)
Normalizes a recommendation.
Parameters:
recomm - the recommendation.
normalization - the identifier of the normalization algorithm.
Returns:
the normalized recommendation.
normalize
private static <L> java.util.Map<L,java.lang.Double> normalize(java.util.Map<L,java.lang.Double> recomm, java.lang.String normalization)
Normalizes a map of values.
Type Parameters:
L - the type of the keys.
Parameters:
recomm - the map.
normalization - the identifier of the normalization algorithm.
Returns:
the normalized map.
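The private normalize methods are not shown on this page, so the following is only a sketch of how the `none`, `minmax`, and `z-score` identifiers documented for main could be applied to a map of values. The class `NormalizeSketch` is hypothetical. Using the population variance and mapping constant inputs to 0.0 are assumptions of this sketch. The `ranksim` scheme is left out because its formula is not described here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical sketch (not the RELISON implementation) of map normalization. */
final class NormalizeSketch {

    /** Returns a new map with the values rescaled according to the given identifier. */
    static <L> Map<L, Double> normalize(Map<L, Double> values, String normalization) {
        if (values.isEmpty() || normalization.equals("none")) {
            return new HashMap<>(values); // Copy without changes.
        }
        Map<L, Double> out = new HashMap<>();
        switch (normalization) {
            case "minmax": {
                // Rescale to [0,1]: (v - min) / (max - min).
                double min = Collections.min(values.values());
                double max = Collections.max(values.values());
                double range = max - min;
                values.forEach((k, v) -> out.put(k, range == 0.0 ? 0.0 : (v - min) / range));
                return out;
            }
            case "z-score": {
                // Standardize: (v - mean) / sd, using the population variance (an assumption).
                double mean = values.values().stream()
                        .mapToDouble(Double::doubleValue).average().orElse(0.0);
                double variance = values.values().stream()
                        .mapToDouble(v -> (v - mean) * (v - mean)).average().orElse(0.0);
                double sd = Math.sqrt(variance);
                values.forEach((k, v) -> out.put(k, sd == 0.0 ? 0.0 : (v - mean) / sd));
                return out;
            }
            default:
                throw new IllegalArgumentException("Unsupported normalization: " + normalization);
        }
    }
}
```

Returning a fresh map instead of modifying the input keeps the raw scores available, which is convenient when the same scores need to be compared under different schemes.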