The logic behind artificial intelligence


In reality, the ultimate objective of Artificial Intelligence in recruitment is to ensure that tens, hundreds or even thousands of applications are processed as objectively and as efficiently as possible:

based on a certain amount of data,

by applying consistent processing to this data from one application to another,

by setting aside all preconceptions, stereotypes and cognitive biases.

The formalization of this systematic processing applied to the data is what is called an algorithm.

The algorithms in question …

An algorithm is (according to Wikipedia): “A finite and unambiguous sequence of operations or instructions for solving a class of problems.”

At first glance, one might think this is a rather “mechanical”, even primitive, way of operating, particularly when it comes to processing applications (with “real people” behind each of them) …

One can indeed easily imagine a somewhat simplistic algorithm of the type:

If the candidate presents characteristic A (e.g. Education = Business School), then 20 points are added to their application.

If the candidate also has characteristic B (e.g. experience in the same sector as the company that wants to recruit), then 40 points are added.

And so on…
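
As a rough illustration, the point-based rule above could be sketched in Python as follows. The Candidate class, its field names and the score function are hypothetical names chosen for this sketch; only the 20-point and 40-point values come from the example above.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    # Hypothetical fields, chosen only for this illustration.
    education: str
    sector: str


def score(candidate: Candidate, recruiting_sector: str) -> int:
    """Add up points for each characteristic the candidate presents."""
    points = 0
    # Characteristic A: education is a business school -> 20 points.
    if candidate.education == "Business School":
        points += 20
    # Characteristic B: experience in the recruiting company's sector -> 40 points.
    if candidate.sector == recruiting_sector:
        points += 40
    return points


print(score(Candidate("Business School", "retail"), "retail"))  # 60
```

Every application goes through exactly the same checks, which is what makes the processing consistent from one candidate to the next.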

What should be understood is that algorithms can also be extremely sophisticated and elegant. For example, they can include “nested logic” such as:

If A is between 70 and 100

AND B has a value between 40 and 60

AND C …

AND D …

AND / OR …

AND … and so on, almost ad infinitum.
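
Such a chain of conditions could be written, for instance, as a table of ranges that must all be satisfied. This is only a sketch: the names RANGE_RULES and matches_all are invented, only the A and B ranges come from the text, and further conditions would simply be added as extra entries.

```python
# Allowed range (inclusive) for each variable. Only A and B are given in the text.
RANGE_RULES = {
    "A": (70, 100),
    "B": (40, 60),
}


def matches_all(values: dict, rules: dict = RANGE_RULES) -> bool:
    """Return True only if every variable falls inside its allowed range (logical AND)."""
    return all(low <= values[name] <= high for name, (low, high) in rules.items())


print(matches_all({"A": 85, "B": 50}))  # True
print(matches_all({"A": 85, "B": 65}))  # False, B is out of range
```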

At this point we already far exceed the data-processing capacity of any recruiter's brain (or any human being's), and yet we are still at a fairly basic level of algorithm!

Indeed, when configuring an algorithm, one can easily integrate other types of processing, notably computing correlations or linear regressions on a pool of data in order to extract threshold values, which can then be applied to the screening of a set of candidates for a given position.

Note that artificial intelligence in recruitment is also extremely useful for identifying, before a recruitment assignment is launched, the factors that drive success and commitment in a given position, in a specific context.

Basically, AI can help list the qualities required to be successful and engaged in a particular job. It may not sound like much, but it is the essential precondition for a successful recruitment. Without it, good luck finding that rare gem!

Machine Learning: How Machines Can Learn On Their Own!

Machine Learning can be defined as: “A set of mathematical and statistical approaches to give a computer system the capacity to learn from data.”

Machine Learning, even if it is clearly underexploited (or not exploited at all) by most vendors of HR Tech solutions today, is nevertheless one of the most promising branches of Artificial Intelligence applied to recruitment.

Basically, Machine Learning in recruiting is what allows you to “learn from previous recruitments”.

Applied systematically, Machine Learning can gradually refine its selection criteria for a given position, so that it becomes increasingly able to “predict the success and commitment of people” in that position.

What factors impact the quality of machine learning?

The factors that determine the quality of a machine learning system, and whether or not it discriminates, fall mainly into two categories:

The type of data that is used,

The representativeness of the database on which machine learning is conducted.

The type of data used in machine learning

It goes without saying that, to avoid introducing perverse effects, Machine Learning must be run on data that does not “potentially” carry a directly discriminatory character (age, sex, etc.) or an indirectly discriminatory one (address, schools, experience, etc.).

Direct criteria, if they are defined as exclusion criteria, directly exclude certain categories of individuals without any link to actual job performance having been established.

Criteria are said to be “indirect” when, even though they do not seem to target a specific category of people, they can still do so in practice. Take the address, for example: if I exclude all applications from the department of Seine-Saint-Denis (93), I will mostly rule out candidates from disadvantaged neighborhoods and/or immigrant backgrounds (whether 1st, 2nd or 3rd generation).
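
The effect of such an indirect criterion can be checked with a few lines of Python. The applicants below are invented, the groups are deliberately abstract ("X" and "Y"), and selection_rates is a hypothetical helper. The point is only that a filter on the department, which never mentions the group, can still produce very different selection rates.

```python
from collections import Counter

# Invented applicants: the filter below never looks at "group".
applicants = [
    {"department": "75", "group": "X"},
    {"department": "75", "group": "X"},
    {"department": "92", "group": "X"},
    {"department": "93", "group": "X"},
    {"department": "93", "group": "Y"},
    {"department": "93", "group": "Y"},
    {"department": "93", "group": "Y"},
    {"department": "75", "group": "Y"},
]


def selection_rates(pool, keep):
    """Share of each group that survives the `keep` filter."""
    totals, kept = Counter(), Counter()
    for person in pool:
        totals[person["group"]] += 1
        if keep(person):
            kept[person["group"]] += 1
    return {group: kept[group] / totals[group] for group in totals}


# Exclude every application from department 93.
rates = selection_rates(applicants, lambda p: p["department"] != "93")
print(rates)  # {'X': 0.75, 'Y': 0.25}
```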

The ideal, when selecting people for hiring, is to run the machine learning system on data that is not (or only slightly) affected by any of the criteria considered as factors of discrimination in hiring by the legislation of the country in which one operates.

The specific case of psychological and behavioral variables

Psychological and behavioral characteristics, on the other hand, constitute a particularly interesting alternative, for the simple reason that they tend to be quite evenly distributed, almost regardless of the criteria used to define a population.

If I choose to preselect my candidates on the basis of their cognitive abilities as assessed through a standardized test, rather than on the criterion “attended such and such a school”, I will inevitably neutralize the factor “was raised by socially successful parents”. This still represents a step forward for fairness in recruitment.

Likewise, if I run Machine Learning on a data set that includes this information (the cognitive ability of individuals) rather than information on the school attended, the system will ultimately tend to offer me applications that are much more diverse in terms of gender, origin or age.

The representativeness of the database on which the learning is carried out

This criterion is particularly interesting to study. If, in my company, I have tended to recruit mainly men from engineering schools for a specific job, and I run a Machine Learning algorithm on their CVs … what do you think will come out?

There is a strong probability that the criteria that emerge will include having attended such and such an engineering school. However, the student bodies of engineering schools are mainly made up of young white men from privileged families.

If diversity is important to me (besides being, by the way, a legal obligation), then I have more to gain by integrating other factors that, as seen above, are less likely to be affected by the social background of applicants.

If, on the other hand, I analyze the results of cognitive tests taken by my population of engineers, I will most likely find that the Machine Learning algorithm brings out a criterion of the type: “Superior cognitive capacities”.

What must be understood is that if the analysis criteria are not affected by potentially discriminating variables, the representativeness of the sample becomes quite secondary.

Why? Because, basically, it doesn’t matter that, when I run the analysis in my company, my workforce consists mainly of “young white men from well-established families”.

What emerges as a criterion is “superior cognitive capacities”, and the proportion of people with superior cognitive abilities is as high among men as among women, and among people living in Paris 16 or Neuilly as in Marseille, Bondy or Mirail …

If I choose to apply this “cognitive aptitude” criterion in the future, while dropping the “engineering school” criterion, I no longer risk selecting only “young white men whose parents come from higher social classes”. On the contrary, I will mechanically increase the diversity within my teams! (Provided, of course, that I also diversify my sourcing).
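
The reasoning can be replayed on a toy candidate pool. Everything here is invented for illustration: the two groups "G1" and "G2", the scores, and the cut-off of 70. The sketch also builds in the premise stated above, namely that cognitive scores are distributed identically in both groups while engineering-school attendance is not.

```python
# Invented pool: identical score distributions in both groups (the premise above),
# but engineering-school attendance concentrated in G1.
pool = [
    {"group": "G1", "school": True, "score": 55},
    {"group": "G1", "school": True, "score": 65},
    {"group": "G1", "school": True, "score": 75},
    {"group": "G1", "school": False, "score": 85},
    {"group": "G2", "school": False, "score": 55},
    {"group": "G2", "school": False, "score": 65},
    {"group": "G2", "school": True, "score": 75},
    {"group": "G2", "school": False, "score": 85},
]


def rate(group, criterion):
    """Fraction of a group's candidates that satisfy the criterion."""
    members = [p for p in pool if p["group"] == group]
    return sum(1 for p in members if criterion(p)) / len(members)


# Selection rate per group under each of the two criteria.
by_school = {g: rate(g, lambda p: p["school"]) for g in ("G1", "G2")}
by_score = {g: rate(g, lambda p: p["score"] >= 70) for g in ("G1", "G2")}
print(by_school)  # {'G1': 0.75, 'G2': 0.25}
print(by_score)   # {'G1': 0.5, 'G2': 0.5}
```

The equal rates under the score criterion are a consequence of the premise, not a proof of it: the sketch only shows what follows if the trait really is distributed the same way in every group.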

Admittedly, the initial sample (the people in my company at the start) was “strongly characterized”, but it made it possible to highlight a distinctive characteristic that is not “specific to this population in particular”. In a way, the exercise brought out a “universal” characteristic.

All this simply to show that the argument “it takes a huge database to do Machine Learning, and the base population has to be sufficiently diverse, otherwise it doesn’t work” … is just incorrect!