{"id":3693,"date":"2019-10-18T10:02:53","date_gmt":"2019-10-18T09:02:53","guid":{"rendered":"https:\/\/www.alvantia.com\/?p=3693"},"modified":"2019-10-18T10:02:57","modified_gmt":"2019-10-18T09:02:57","slug":"meta-algorithms-and-base-models-in-machine-learning","status":"publish","type":"post","link":"https:\/\/www.alvantia.com\/en\/meta-algorithms-and-base-models-in-machine-learning\/","title":{"rendered":"Meta algorithms and base models in Machine Learning"},"content":{"rendered":"\n<p class=\"has-normal-font-size\">A traditional, basic approach to problems in <strong><span style=\"color:#313131\" class=\"tadv-color\">machine learning<\/span><\/strong> is to use a single, specific algorithm or model to implement a valid solution. This traditional method is based on creating a pipeline of all the processes involved in the development of that algorithm.<\/p>\n\n\n\n<!--more-->\n\n\n\n<p class=\"has-normal-font-size\">This pipeline consists of several phases: ETL processing, data analysis, model implementation and training, analysis of the results based on chosen metrics and other purging mechanisms, and lastly, the deployment of the model into production.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/pipeline.png\" alt=\"\" class=\"wp-image-3598\" width=\"248\" height=\"248\" srcset=\"https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/pipeline.png 502w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/pipeline-150x150.png 150w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/pipeline-300x300.png 300w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/pipeline-350x350.png 350w\" sizes=\"auto, (max-width: 248px) 100vw, 248px\" \/><\/figure><\/div>\n\n\n\n<p class=\"has-normal-font-size\">Mainly, during the research and implementation phase of the model, one or 
several specific algorithms are used as a potential solution to the problem, given its category. Naturally, regression algorithms will be chosen to predict the outcome of a quantitative variable. In a more elaborate approach, where the number of features is so large that manual analysis and interpretation of the results becomes considerably more complex, it is possible to split the problem into several phases by using meta algorithms.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">In this situation, the cycle of testing, adjusting and error correction is fundamental in the search for the algorithm or model with the correct hyper-parameters.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">A parallel solution is the <strong><span style=\"color:#313131\" class=\"tadv-color\">composition of base algorithms and a superior meta algorithm.<\/span><\/strong> This set of algorithms forms a stack, where the meta algorithm is the last link in the chain and the base algorithms are implemented simultaneously to feed its input.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">In this architectural approach, the base algorithms use different sets of input features; some of them may be shared. 
In this way, we will have a different model for each set of features. The way to separate these sets is based on prior analysis of the data or, failing that, on the intake source.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/training-set.png\" alt=\"\" class=\"wp-image-3600\" width=\"481\" height=\"428\" srcset=\"https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/training-set.png 656w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/training-set-300x267.png 300w\" sizes=\"auto, (max-width: 481px) 100vw, 481px\" \/><\/figure><\/div>\n\n\n\n<p class=\"has-normal-font-size\">In practice, in a non-trivial problem such as the detection of fraudulent transactions or financial fraud, the number of attributes to be handled can become unmanageable. Using the meta algorithm structure, a type of algorithm is chosen for each data source or set of features, making it possible to establish which features are essential or can boost the development of the final model.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">Moreover, different models can be studied for different sets of attributes and analysed to improve the correlation between them. This maximises the capacity to select the correct model and the set of predominant features in its development.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">In this architecture, the meta algorithm uses the precision percentage, or another associated metric, of each of the base algorithms, with the binary label (in this case, true or false) as its target.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">In the analysis of meta algorithms it is useful to apply <strong><a href=\"https:\/\/arxiv.org\/pdf\/1802.03888.pdf\">SHAP<\/a><\/strong> (SHapley Additive exPlanations) to determine which of the base algorithms is most relevant in producing the actual predictions and labels.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">SHAP is an approach for explaining the results of machine learning models and algorithms. It combines game theory with local explanation methods to estimate the expected contribution of each feature. 
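As an illustration of SHAP's game-theoretic basis, the sketch below computes exact Shapley values by brute force for a toy linear fraud-score model. The feature names, weights and baseline values are invented purely for this example; real SHAP implementations approximate these averages efficiently rather than enumerating every coalition.

```python
from itertools import combinations
from math import factorial

# Hypothetical linear "fraud score" model: a weighted sum of three features.
WEIGHTS = {"amount": 0.5, "country_risk": 0.3, "hour": 0.2}
BASELINE = {"amount": 0.2, "country_risk": 0.1, "hour": 0.5}

def predict(x):
    # Model output: weighted sum of feature values.
    return sum(WEIGHTS[f] * x[f] for f in WEIGHTS)

def coalition_value(x, coalition):
    # Model output when only the features in `coalition` take their real
    # values; the remaining features are held at their baseline values.
    filled = {f: (x[f] if f in coalition else BASELINE[f]) for f in WEIGHTS}
    return predict(filled)

def shapley_values(x):
    # Exact Shapley value of each feature: its marginal contribution,
    # averaged over every coalition of the remaining features.
    features = list(WEIGHTS)
    n = len(features)
    phi = {}
    for f in features:
        others = [g for g in features if g != f]
        total = 0.0
        for k in range(n):
            for coalition in combinations(others, k):
                weight = factorial(k) * factorial(n - k - 1) / factorial(n)
                with_f = coalition_value(x, set(coalition) | {f})
                without_f = coalition_value(x, set(coalition))
                total += weight * (with_f - without_f)
        phi[f] = total
    return phi

transaction = {"amount": 0.9, "country_risk": 0.8, "hour": 0.4}
contributions = shapley_values(transaction)
# For a linear model this reduces to w_i * (x_i - baseline_i), and the
# contributions sum to predict(transaction) - predict(BASELINE).
```

The additivity property checked in the last comment is what makes the per-feature attributions directly comparable across base models.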
It detects the correlations among the features and the value they contribute to the model\u2019s results.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">In the following example, it is possible to observe which attributes contribute most to the learning model.<\/p>\n\n\n\n<div class=\"wp-block-image\"><figure class=\"aligncenter is-resized\"><img loading=\"lazy\" decoding=\"async\" src=\"https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/grafica-1024x724.png\" alt=\"\" class=\"wp-image-3602\" width=\"403\" height=\"285\" srcset=\"https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/grafica-1024x724.png 1024w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/grafica-300x212.png 300w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/grafica-768x543.png 768w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/grafica-750x531.png 750w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/grafica-1200x849.png 1200w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/grafica-1680x1188.png 1680w, https:\/\/www.alvantia.com\/wp-content\/uploads\/2019\/10\/grafica.png 1767w\" sizes=\"auto, (max-width: 403px) 100vw, 403px\" \/><\/figure><\/div>\n\n\n\n<p class=\"has-normal-font-size\">The sex and age features are implicitly correlated in the model\u2019s inference depending on the age range. In this example, a man of approximately 60 has a greater impact on the model\u2019s output, whereas a woman of the same age does not stand out.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">Going back to the case of financial fraud detection, this architecture, apart from the advantages shown, has some disadvantages.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">The main disadvantage arises during the implementation, training and testing phase, owing to the stack of base algorithms that must be implemented and trained. 
Code development becomes fairly complex, training is computationally demanding, and we are constrained by the source and volume of the data.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">The time necessary to develop the entire set of algorithms is much longer than that of a traditional implementation.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">However, as a research phase, by executing batch processes that run meta algorithms with different model configurations and programmed sets of features, it is possible to reach solutions to more complex problems.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">As assistance in developing this architecture, <a href=\"http:\/\/rasbt.github.io\/mlxtend\/\"><strong>Python\u2019s MLxtend library<\/strong><\/a> builds on the <a href=\"https:\/\/scikit-learn.org\/stable\/\"><strong>Scikit-Learn algorithms<\/strong><\/a> to implement the model stack and feed a principal meta algorithm.<\/p>\n\n\n\n<p class=\"has-normal-font-size\">Due to the complexity of this paradigm, it is advisable to start the research phase with simpler solutions. 
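The stack described above, where base models trained on different feature subsets feed a principal meta algorithm, can be sketched in plain Python. The feature names and thresholds below are hypothetical stand-ins: in a real system each function would be a trained estimator (for example, Scikit-Learn models combined through MLxtend's stacking utilities), and the meta model would itself be trained on the base models' outputs.

```python
def base_model_amount(tx):
    # Base model 1: sees only the amount-related features
    # (hypothetical threshold in place of a trained classifier).
    return 1 if tx["amount"] > 1000 else 0

def base_model_location(tx):
    # Base model 2: sees only the location-related features.
    return 1 if tx["country"] not in tx["usual_countries"] else 0

def meta_model(base_outputs):
    # Meta model: the last link in the chain. Here a simple vote stands
    # where a trained model over the base outputs would normally sit.
    return 1 if sum(base_outputs) >= 1 else 0

def predict_fraud(tx):
    # The base models run independently on their own feature subsets,
    # and their outputs become the meta model's input features.
    return meta_model([base_model_amount(tx), base_model_location(tx)])

suspicious = {"amount": 2500, "country": "XX", "usual_countries": {"ES", "FR"}}
normal = {"amount": 40, "country": "ES", "usual_countries": {"ES", "FR"}}
```

Separating the feature subsets this way is what later lets a tool like SHAP attribute the final decision to individual base models.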
According to the principle of parsimony, better known as Occam&#8217;s razor:<\/p>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><em><span style=\"color:#313131\" class=\"tadv-color\">\u201cAll things being equal, the simplest solution tends to be the best one.\u201d<\/span><\/em><\/strong><\/h5>\n\n\n\n<p class=\"has-normal-font-size\">Applied to the field of statistical and computational learning theory, this principle could be expressed as follows:<\/p>\n\n\n\n<h5 class=\"wp-block-heading\"><strong><em><span style=\"color:#313131\" class=\"tadv-color\">\u201cThe less complex a machine learning model is, the less probable it will be that a good empirical result is simply due to the peculiarities of the sample taken.\u201d<\/span><\/em><\/strong><\/h5>\n<div class=\"clearfix\"><\/div>","protected":false},"excerpt":{"rendered":"<p>A traditional, basic approach in the problems of machine learning is to use a single determined algorithm or model in the implementation of a valid solution. This traditional method is based on creating a pipeline of all the processes that are covered by the development of this algorithm.<\/p>\n<p class=\"cv-read-more-button\"><a class=\"cv-button button is-standard color-accent has-icon icon-after\" href=\"https:\/\/www.alvantia.com\/en\/meta-algorithms-and-base-models-in-machine-learning\/\">Continue Reading<i class=\"button-icon 
icon-right-open-big\"><\/i><\/a><\/p>\n","protected":false},"author":5,"featured_media":3603,"comment_status":"closed","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[215,186,187],"tags":[37,315,171],"class_list":["post-3693","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-alvantia-en","category-alvantia-2","category-technology","tag-alvantia-en","tag-machine-learning-2","tag-software-en","not-single"],"aioseo_notices":[],"_links":{"self":[{"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/posts\/3693","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/comments?post=3693"}],"version-history":[{"count":6,"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/posts\/3693\/revisions"}],"predecessor-version":[{"id":3699,"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/posts\/3693\/revisions\/3699"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/media\/3603"}],"wp:attachment":[{"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/media?parent=3693"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/categories?post=3693"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/www.alvantia.com\/en\/wp-json\/wp\/v2\/tags?post=3693"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}