A new perspective on classification (2024)

research-article

Authors: Toon Vanderschueren, Bart Baesens, Tim Verdonck, and Wouter Verbeke

Published: 02 July 2024

    Abstract

    A central problem in business concerns the optimal allocation of limited resources to a set of available tasks, where the payoff of these tasks is inherently uncertain. Typically, such problems are solved using a classification framework, where task outcomes are predicted given a set of characteristics. Then, resources are allocated to the tasks predicted to be the most likely to succeed. We argue, however, that using classification to address task uncertainty is inherently suboptimal as it does not take into account the available capacity. We present a novel solution that directly optimizes the assignment's expected profit given limited, stochastic capacity. This is achieved by optimizing a specific instance of the normalized discounted cumulative gain (NDCG), a commonly used class of metrics in learning to rank. We demonstrate that our new method achieves higher expected profit and expected precision than a classification approach across a wide variety of application areas.
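To make the idea of "expected profit given limited, stochastic capacity" concrete, the following is a minimal sketch and not the authors' implementation. It assumes that tasks are served in descending score order, that capacity is a random number of tasks with a known probability mass function, and that the task at rank k is served only when capacity is at least k. Under those assumptions the expected profit is a discounted cumulative gain whose discount at rank k is P(capacity >= k). The function name `expected_profit` and its parameters are hypothetical.

```python
import numpy as np


def expected_profit(scores, profits, capacity_pmf):
    """Expected profit of serving tasks in descending score order.

    Assumption (illustrative only): capacity_pmf[c] is the probability that
    exactly c tasks can be served, for c = 0, 1, ..., len(capacity_pmf) - 1.
    """
    scores = np.asarray(scores, dtype=float)
    profits = np.asarray(profits, dtype=float)
    pmf = np.asarray(capacity_pmf, dtype=float)

    # Rank tasks from highest to lowest score.
    order = np.argsort(-scores, kind="stable")

    # tail[c] = P(capacity >= c), computed as a reversed cumulative sum.
    tail = np.cumsum(pmf[::-1])[::-1]

    # Discount for rank k (1-based) is P(capacity >= k); ranks beyond the
    # largest possible capacity are never served and get discount 0.
    n = len(scores)
    discount = np.zeros(n)
    m = min(n, len(pmf) - 1)
    discount[:m] = tail[1:m + 1]

    return float(np.sum(discount * profits[order]))


# Three tasks; capacity is 1 or 2 tasks with equal probability.
profits = [10.0, -2.0, 5.0]
capacity_pmf = [0.0, 0.5, 0.5]

ranking_a = expected_profit([0.9, 0.8, 0.1], profits, capacity_pmf)
ranking_b = expected_profit([0.9, 0.1, 0.8], profits, capacity_pmf)
print(ranking_a, ranking_b)
```

In this toy example the two score vectors select the same first task but differ in which task is placed second, and the ranking that places the profitable task second yields the higher expected profit. This illustrates why the ordering near the capacity boundary matters more than the ordering further down the list.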

    Highlights

    We formulate the problem of allocating limited resources to uncertain tasks.

    Our novel solution considers capacity limitation in the optimization.

    Our approach outperforms classification-based methods for a variety of applications.

    Information & Contributors

    Information

    Published In

    Decision Support Systems Volume 179, Issue C

    Apr 2024

    245 pages

    ISSN:0167-9236

    Elsevier B.V.

    Publisher

    Elsevier Science Publishers B. V.

    Netherlands

    Publication History

    Published: 02 July 2024

    Author Tags

    1. Machine learning
    2. Optimal resource allocation
    3. Classification
    4. Learning to rank

    Qualifiers

    • Research-article
