Class MLFeatureGenerator

java.lang.Object
es.uam.eps.ir.relison.examples.links.recommendation.MLFeatureGenerator

public class MLFeatureGenerator
extends java.lang.Object
Class for generating learning-to-rank / machine learning examples. The examples are written in the LETOR format.
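For orientation, a LETOR-style line holds a relevance label, a query identifier, and 1-based `index:value` feature pairs, optionally followed by a comment. The sketch below formats one such line. It is not taken from this class: `LetorLineSketch`, `toLetorLine`, the six-decimal formatting, and the comment text are assumptions made for illustration, and the exact fields this generator writes may differ.

```java
import java.util.Locale;

public class LetorLineSketch {
    // Formats one example as "<relevance> qid:<query> 1:<f1> 2:<f2> ... # <comment>".
    // Hypothetical helper: not part of MLFeatureGenerator.
    static String toLetorLine(int relevance, long queryId, double[] features, String comment) {
        StringBuilder sb = new StringBuilder();
        sb.append(relevance).append(" qid:").append(queryId);
        for (int i = 0; i < features.length; i++) {
            // Feature indices are 1-based in LETOR files.
            sb.append(' ').append(i + 1).append(':')
              .append(String.format(Locale.ROOT, "%.6f", features[i]));
        }
        if (comment != null && !comment.isEmpty()) {
            sb.append(" # ").append(comment);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Assumed mapping: the target user (42) plays the role of the query,
        // and the candidate user (7) is recorded in the comment.
        System.out.println(toLetorLine(1, 42L, new double[] {0.5, 0.0, 3.25}, "candidate=7"));
        // prints: 1 qid:42 1:0.500000 2:0.000000 3:3.250000 # candidate=7
    }
}
```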
  • Constructor Summary

    Constructors 
    Constructor Description
    MLFeatureGenerator()  
  • Method Summary

    Modifier and Type Method Description
    private static FeatureInformation computeInstances(java.lang.String train, java.lang.String test, boolean directed, boolean weightedSampling, boolean weightedClasses, boolean weightedFeatures, java.lang.String sampling, java.lang.String output, java.util.List<java.lang.String> descriptions, java.util.List<FeatureType> types, java.util.List<RecommendationAlgorithmFunction<java.lang.Long>> similarities, java.util.List<VertexMetricFunction<java.lang.Long>> vertexmetrics, java.util.List<PairMetricFunction<java.lang.Long>> pairmetrics, java.lang.String normalization)
    Computes the instances for a pair of graphs.
    static void main(java.lang.String[] args)
    Builds a set of learning-to-rank instances using similarities between pairs of users.
    private static es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long, java.lang.Long> normalize(es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long, java.lang.Long> recomm, java.lang.String normalization)
    Normalizes a recommendation.
    private static <L> java.util.Map<L, java.lang.Double> normalize(java.util.Map<L, java.lang.Double> recomm, java.lang.String normalization)
    Normalizes a map of values.

    Methods inherited from class java.lang.Object

    clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
  • Constructor Details

  • Method Details

    • main

      public static void main(java.lang.String[] args) throws java.io.IOException
      Builds a set of learning-to-rank instances using similarities between pairs of users.
      Parameters:
      args - Execution arguments:
      • Train-instance graph: Graph for obtaining the features of the training set
      • Train-class graph: Graph for obtaining the relevance of each training instance
      • Test-instance graph: Graph for obtaining the features of the test set
      • Test-class graph: Graph for obtaining the relevance of each test set example
      • Directed: true if the graph is directed, false otherwise
      • Weighted sampling: true if we want to use graph weights for the sampling procedures.
      • Weighted classes: true if we take weights as classes (otherwise, binary classes).
      • Weighted features: true if we have to use weights for computing the features.
      • Train sampling: configuration for the individual sampler used in training.
      • Test sampling: configuration for the individual sampler used in test.
      • Configuration: YAML file containing the configurations we want to use.
      • Train output: File to store the training examples
      • Test output: File to store the test examples
      • Normalization: Score normalization:
        • none: No normalization
        • ranksim: Ranking normalization
        • minmax: Rescale the scores to the interval [0, 1]
        • z-score: Rescale the scores of each query to have zero mean and unit variance
      Throws:
      java.io.IOException - if something fails while reading/writing
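Since the arguments are positional, it can help to check their count and order before launching the generator. The following is a minimal sketch of such a check, assuming the fourteen arguments are passed in exactly the order listed above. `ArgsOrderSketch` and `describe` are hypothetical names, the file names and sampler strings are placeholders, and the strict `true`/`false` validation is an assumption rather than documented behavior of `main`.

```java
import java.util.List;

public class ArgsOrderSketch {
    // Argument names, in the order documented for main.
    static final List<String> NAMES = List.of(
        "Train-instance graph", "Train-class graph", "Test-instance graph", "Test-class graph",
        "Directed", "Weighted sampling", "Weighted classes", "Weighted features",
        "Train sampling", "Test sampling", "Configuration",
        "Train output", "Test output", "Normalization");

    // Normalization identifiers listed in the documentation.
    static final List<String> NORMALIZATIONS = List.of("none", "ranksim", "minmax", "z-score");

    // Returns one "name = value" line per argument, or throws if the arguments look malformed.
    static String describe(String[] args) {
        if (args.length != NAMES.size()) {
            throw new IllegalArgumentException(
                "Expected " + NAMES.size() + " arguments, got " + args.length);
        }
        // Positions 4 to 7 are the four boolean flags (assumed to be "true" or "false").
        for (int i = 4; i <= 7; i++) {
            if (!args[i].equals("true") && !args[i].equals("false")) {
                throw new IllegalArgumentException(NAMES.get(i) + " must be true or false");
            }
        }
        if (!NORMALIZATIONS.contains(args[13])) {
            throw new IllegalArgumentException("Unknown normalization: " + args[13]);
        }
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < args.length; i++) {
            sb.append(NAMES.get(i)).append(" = ").append(args[i]).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Placeholder values only; real paths and sampler configurations will differ.
        String[] example = {
            "train-instances.txt", "train-classes.txt", "test-instances.txt", "test-classes.txt",
            "true", "false", "false", "false",
            "train-sampler", "test-sampler", "config.yml",
            "train.letor", "test.letor", "minmax"};
        System.out.print(describe(example));
    }
}
```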
    • computeInstances

      private static FeatureInformation computeInstances(java.lang.String train, java.lang.String test, boolean directed, boolean weightedSampling, boolean weightedClasses, boolean weightedFeatures, java.lang.String sampling, java.lang.String output, java.util.List<java.lang.String> descriptions, java.util.List<FeatureType> types, java.util.List<RecommendationAlgorithmFunction<java.lang.Long>> similarities, java.util.List<VertexMetricFunction<java.lang.Long>> vertexmetrics, java.util.List<PairMetricFunction<java.lang.Long>> pairmetrics, java.lang.String normalization) throws java.io.IOException
      Computes the instances for a pair of graphs.
      Parameters:
      train - training graph.
      test - validation/test graph.
      directed - true if the graph is directed, false otherwise.
      weightedSampling - true if we use edge weights for the sampling procedure, false if we take binary weights.
      weightedClasses - true if we use edge weights as classes, false if we use binary classes.
      weightedFeatures - true if we use edge weights to compute the features, false otherwise.
      sampling - sampling algorithm grid.
      output - file in which to output the examples.
      descriptions - list of feature descriptions.
      types - list of feature types.
      similarities - list of recommendation algorithms to compute scores as features.
      vertexmetrics - list of vertex metrics to compute the features.
      pairmetrics - list of pair metrics to compute the features.
      normalization - the normalization scheme to use.
      Returns:
      the information about the features.
      Throws:
      java.io.IOException - if something failed while creating the instances.
    • normalize

      private static es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long, java.lang.Long> normalize(es.uam.eps.ir.ranksys.core.Recommendation<java.lang.Long, java.lang.Long> recomm, java.lang.String normalization)
      Normalizes a recommendation.
      Parameters:
      recomm - the recommendation.
      normalization - the identifier of the normalization algorithm.
      Returns:
      the normalized recommendation.
    • normalize

      private static <L> java.util.Map<L, java.lang.Double> normalize(java.util.Map<L, java.lang.Double> recomm, java.lang.String normalization)
      Normalizes a map of values.
      Type Parameters:
      L - the type of the keys.
      Parameters:
      recomm - the map.
      normalization - the identifier of the normalization algorithm.
      Returns:
      the normalized map.
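The actual body of this private method is not shown on this page. As a rough illustration of what the `none`, `minmax`, and `z-score` identifiers documented under `main` usually mean, here is a minimal sketch over a generic map. `NormalizeSketch` is a hypothetical class; the use of population variance, the handling of constant inputs (mapped to 0), and the exception for unknown identifiers are assumptions. `ranksim` is left out because its definition is not given here.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;

public class NormalizeSketch {
    // Returns a new map with normalized values; the input map is not modified.
    static <L> Map<L, Double> normalize(Map<L, Double> values, String normalization) {
        if (values.isEmpty() || normalization.equals("none")) {
            return new HashMap<>(values);
        }
        Map<L, Double> out = new HashMap<>();
        switch (normalization) {
            case "minmax": {
                // Rescale to [0, 1]: (v - min) / (max - min).
                double min = Collections.min(values.values());
                double max = Collections.max(values.values());
                double range = max - min;
                values.forEach((k, v) -> out.put(k, range == 0.0 ? 0.0 : (v - min) / range));
                return out;
            }
            case "z-score": {
                // Shift to zero mean and scale to unit (population) variance.
                double mean = values.values().stream()
                    .mapToDouble(Double::doubleValue).average().orElse(0.0);
                double variance = values.values().stream()
                    .mapToDouble(v -> (v - mean) * (v - mean)).average().orElse(0.0);
                double sd = Math.sqrt(variance);
                values.forEach((k, v) -> out.put(k, sd == 0.0 ? 0.0 : (v - mean) / sd));
                return out;
            }
            default:
                // Assumption: unknown identifiers are rejected rather than ignored.
                throw new IllegalArgumentException("Unsupported normalization: " + normalization);
        }
    }

    public static void main(String[] args) {
        Map<String, Double> scores = new HashMap<>();
        scores.put("a", 1.0);
        scores.put("b", 3.0);
        scores.put("c", 5.0);
        System.out.println(normalize(scores, "minmax"));
        System.out.println(normalize(scores, "z-score"));
    }
}
```

Normalizing within each target user's candidate list, rather than across all users at once, keeps features comparable between queries, which is the usual motivation for per-query normalization in learning-to-rank data.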