AI and Entrepreneurship: Facial Recognition Technology Detects Entrepreneurs, Outperforming Human Experts

doi:10.21203/rs.3.rs-4926308/v1

2024 · doi:10.21203/rs.3.rs-4926308/v1

preprint OA: closed


Martin Obschonka, Christian Fisch, Tharindu Fernando, Clinton Fookes

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-4926308/v1

This work is licensed under a CC BY 4.0 License.

Status: Posted. Version 1.

Abstract

Occupational outcomes like entrepreneurship are generally considered personal information that individuals should have the autonomy to disclose. With the advancing capability of artificial intelligence (AI) to infer private details from widely available human-centric data, such as social media, it is crucial to investigate whether AI can accurately extract private occupational information from such data.
In this study, we demonstrate that deep neural networks can classify individuals as entrepreneurs based on a single facial image with high accuracy in data sourced from Crunchbase, a premier source for entrepreneurship data. Utilizing a dataset comprising facial images of 40,728 individuals, including both entrepreneurs and non-entrepreneurs, we trained a Convolutional Neural Network (CNN) and evaluated its classification performance. While human experts (n = 650) and trained participants (n = 133) were unable to classify entrepreneurs with accuracy above chance level (i.e., above 50%), the AI model achieved a classification accuracy of 79.51%. Several robustness tests show that this high level of accuracy is maintained under various conditions.

Keywords: artificial intelligence, AI, facial recognition technology, deep learning, entrepreneur, entrepreneurship, Convolutional Neural Network (CNN)

1. Introduction

The ability of artificial intelligence (AI), such as deep learning, to extract complex and valid information from human-centric data has emerged as a pivotal research topic across various scientific disciplines. As AI technologies evolve, their applications in analyzing vast amounts of human-centric data (such as facial images, social media text, medical records, and other forms of digital interactions) have expanded exponentially. Recent advancements in AI, for instance, have leveraged facial recognition technology to infer a range of basic personal attributes (e.g., gender, age, smile) from individuals’ facial images [1]. Furthermore, an increasing number of studies demonstrate that AI models can also identify more latent and private personal attributes from facial images with accuracy levels significantly surpassing those of humans. For example, research has reported such beyond-human accuracy for latent personal attributes like personality [2], sexual orientation [3], and political orientation [4, 5].
These developments raise significant ethical concerns, in addition to concerns regarding privacy, civil liberties, and the actual prediction scope of AI [4, 5, 6, 7]. This highlights the need to better understand the extent to which AI methods can detect private information in widely shared human-centric data and the associated ethical risks.

To contribute to this debate, we expand research on AI’s ability to capture personal attributes to encompass the hitherto unexplored occupational domain. Specifically, we explore whether, in a specific dataset drawn from Crunchbase (www.crunchbase.com), a premier data source for entrepreneurship data that is commonly used by researchers and practitioners, deep learning can detect intricate occupational information about a person from facial images (here: entrepreneur vs. non-entrepreneur). We also evaluate the accuracy and robustness of this classification and compare it to human classification performance.

Our focus is on distinguishing between entrepreneurs and non-entrepreneurs for several reasons. Firstly, this dichotomy represents a well-established broad taxonomy of occupations in society [8, 9] that significantly influences an individual’s occupational choices. Secondly, although many individuals voluntarily disclose such occupational information (e.g., on social media platforms like LinkedIn or company websites), the information remains fundamentally private. So far, it remains unclear to what extent this information can be classified from widely shared human-centric data, particularly where individuals are unaware they might be revealing private details, using ubiquitous AI methods. Thirdly, entrepreneurs and their ventures play a critical role in the economy, particularly in innovation and job creation [10, 11], making them important subjects for research, education, and policy making.
Lastly, there is considerable interdisciplinary research interest in private personal attributes as correlates of entrepreneurship [12, 13, 14, 15, 16].

Note that we deliberately refrain from interpreting the data regarding systematic facial differences. Such interpretations can be fundamentally biased and would not adhere to currently established, yet still evolving scientific and ethical standards [6, 17, 18]. Thus, we solely focus on testing whether deep learning can classify entrepreneurs vs. non-entrepreneurs with an accuracy that is (a) above random chance and (b) above the accuracy achieved by human experts using a specific dataset sourced from Crunchbase. This dataset is not fully representative of the general population and, importantly, could transfer social bias and discrimination in the real world into AI results, which could then reinforce or even amplify such social biases. While acknowledging the ethical sensitivity of our research, we also recognize its scientific and societal importance, especially when interpreted in the most ethical and scientific ways possible. As such, and like related research [4, 5], we deem it important to carry out and report on such research to inform the public about the power of AI, as well as potential social, ethical, and methodological aspects of such methods, thereby contributing to ongoing complex discussions in this research space.

2. Methods

This section outlines the methodological cornerstones of our study. We provide more technical details in our supplementary information (Appendix A), especially regarding the development and setup of our AI model. This study (including the online experiment with human entrepreneurship experts) was approved by the Research Ethics Committee of the Queensland University of Technology (approval no. 2000000651). The additional follow-up analysis with trained humans was approved by the Economics and Business Ethics Committee of the University of Amsterdam (approval no.
EB-6781). All methods were performed in accordance with relevant guidelines and regulations. The identifiable face images that are shown in this paper (including supplementary information) were sourced from publicly available image sets or published with the informed consent of the subject.

2.1 AI model: data, design, and training

2.1.1 Data collection

We collect a comprehensive dataset of facial images for training and testing the AI model. We draw our sample of facial images from Crunchbase, one of the premier databases used in contemporary entrepreneurship research [14, 19, 20]. Importantly, Crunchbase provides information on entrepreneurs and non-entrepreneurs (e.g., managers and other employees of entrepreneurial ventures, investors). Each individual has a profile page that contains demographic information and data on the individual’s employment and entrepreneurship history. Crunchbase also displays profile pictures (mostly facial images) that are publicly accessible.

We retrieved our sample of facial images from Crunchbase in March 2019, considering individuals with a CB rank between 1 and 100,000. The CB rank is an internal identifier that Crunchbase uses to indicate prominent individuals. Among those individuals, we only considered individuals located in the US and with non-missing gender information. This process yielded an initial sample of 42,043 individuals. To identify entrepreneurs, we used information on the number of organizations that the individual had founded [21]. Hence, we characterized individuals who had founded at least one organization as entrepreneurs (n = 25,071, 59.6%). Our initial sample was reduced to 40,728 after removing individuals without processable facial images in Crunchbase (e.g., placeholders, comic images). Of our facial images, 81% depict male individuals and 19% depict female individuals.

Using facial images as a data source has only recently gained attention in entrepreneurship research.
Specifically, entrepreneurship research has begun to use facial images to capture emotions [22, 23] and attributes such as attractiveness, competence, and intelligence [24]. Additionally, a recent study analyzes indicators of facial geometry (i.e., facial width-to-height ratio, cheekbone prominence, facial symmetry) and facial appearance to predict whether an individual emerges as an entrepreneur and entrepreneurial success [25].

2.1.2 Data pre-processing

We pre-process the raw data to filter out irrelevant information and clean the data. Specifically, we resize all facial images to a uniform size of 224×224 pixels, which is a standard input size used in many machine learning and face-related applications [26, 27]. This resizing is also necessary to match the input size of the pre-trained feature extraction model that we later use. Moreover, facial images can contain additional information apart from the individual’s face (e.g., background, other body parts). Therefore, we leverage a face detector [28] to detect the face region in each image. The face detector outputs the coordinates of the face in the image, which can then be used to crop each facial image so that it only contains the face region (for more technical information, see Supplementary Information: Appendix A).

2.1.3 Feature extraction

We then extract meaningful representations (i.e., features) from the high-dimensional raw input data, which are then used by the AI model to distinguish entrepreneurs and non-entrepreneurs. As our facial feature extractor, we use a VGG-Face 2 model [26], which is a prominent deep learning model that was pre-trained on a large dataset of facial images.

2.1.4 Classifier design and training

We adopt a Convolutional Neural Network (CNN) architecture as our classifier. The objective of the CNN is to further refine the features that the pre-trained feature extractor has extracted and to identify the task-specific features.
This network first analyses small parts of the input image, recognizing common patterns such as edges or distinct colors in the input. These features are hierarchically combined and build a spatial hierarchy of features. This is performed using pooling layers in CNNs, which downsample the located features, reducing spatial dimensions while preserving important features. Therefore, CNNs can form complex shapes and object parts from simple edges and textures. Due to this property, CNNs are highly effective in extracting salient characteristics from images, which is why we utilize a CNN as our backbone network for feature extraction.

In the proposed AI model, we need to compare two facial images. We leverage a shared CNN backbone to extract features to compare the two faces, one feature representation per image. The shared CNN backbone allows comparing the two images with respect to the common features that the CNN backbone has learned. After extracting the features of both inputs, our AI model assesses the similarities and differences between the two faces. Figure 1 provides an overview of this CNN-based classification model.

- Please insert Fig. 1 about here -

In the subsequent classifier training stage, the AI model learns to automatically identify and extract task-specific informative features from this higher-dimensional input space. Deep machine learning models are capable of learning hierarchical representations of features through multiple layers of abstraction. Low-level features like edges or textures are identified in the lower levels of the hierarchy. These features are then combined in the top layers of the hierarchy to form abstract features like object shapes. Deep learning models utilize trainable weights, which are tuned during the training process to locate informative features. Each layer within the hierarchical structure of the deep machine learning model has several thousand such weights, which need to be optimized.
Tuning these from scratch requires a large number of training samples, which are usually hard to obtain. As a solution, transfer learning approaches have emerged where pre-trained models that have been trained for a different but related task are leveraged for feature extraction. For example, face analysis tasks can leverage pre-trained face recognition models which have been extensively trained on large-scale datasets for detecting distinct facial characteristics. These models have learned from vast amounts of data during pre-training and can extract comprehensive and richer feature representations. We fine-tune our classification model to identify relevant task-specific features from the set of features that the pre-trained model extracts. Therefore, transfer learning reduces the need for an extensive collection of large-scale datasets (for more information related to transfer learning technology, see [29]).

During the training, we provide pairs of facial images to our model. Each pair of images comprises one entrepreneur and one non-entrepreneur. We refer to the two samples within a pair as the left image and the right image. When generating the face image pairs, we ensure that both images in a pair depict individuals of the same gender (either both male or both female). We randomly vary the position in which the entrepreneur’s face appears within the pair, so that it can appear as either the left or the right input image. Our model outputs 0 (zero) if the left image is an entrepreneur and 1 (one) if the right image is an entrepreneur. Through the learning process, the model learns which features are important for making accurate comparisons, and our model can identify which image is of an entrepreneur. Using the entrepreneur and non-entrepreneur facial images in our dataset, we randomly pick pairs, ensuring that each pair contains an entrepreneur and a non-entrepreneur.
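
The pairing rules described above (one entrepreneur and one non-entrepreneur per pair, same gender within a pair, random left/right position, label 0 or 1) can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the record keys `id`, `gender`, and `is_entrepreneur`, the function name `make_pairs`, and the choice to use each image at most once are all assumptions made for the example.

```python
import random

def make_pairs(people, seed=0):
    """Build (left_id, right_id, label) triples from a list of records.

    Each record is a dict with the hypothetical keys 'id', 'gender' and
    'is_entrepreneur'. Label 0 means the entrepreneur is the left image,
    label 1 means the entrepreneur is the right image.
    """
    rng = random.Random(seed)
    pairs = []
    for gender in sorted({p["gender"] for p in people}):
        # Pair only within one gender group.
        ents = [p for p in people if p["gender"] == gender and p["is_entrepreneur"]]
        nons = [p for p in people if p["gender"] == gender and not p["is_entrepreneur"]]
        rng.shuffle(ents)
        rng.shuffle(nons)
        # Assumption: each image is used at most once, so the smaller group
        # limits the number of pairs.
        for ent, non in zip(ents, nons):
            if rng.random() < 0.5:
                pairs.append((ent["id"], non["id"], 0))  # entrepreneur on the left
            else:
                pairs.append((non["id"], ent["id"], 1))  # entrepreneur on the right
    return pairs
```

Randomizing the position means that a model cannot reach above-chance accuracy by always choosing the same side.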
We follow the standard training and testing evaluation protocol in machine learning and randomly select 75% of these pairs for model training and the remaining 25% for model testing. A subset (10%) of the training data is held out as a validation set to evaluate the model’s performance on unseen data. Good performance on the validation set indicates good generalization of the model without overfitting. We then randomly initialize the weights of the model 10 times. In each repetition, the model reaches a different convergence point, which yields one accuracy score on the test data. The training and validation accuracy curves of the proposed model are provided in Figure SI1 (Supplementary Information: Appendix A). The curves show that there is no major divergence between the training and validation accuracies, which indicates that the model is not overfitting.

2.2 Experiment with human entrepreneurship experts

We compare the accuracy of our AI model with human entrepreneurship experts to assess the beyond-human capacity of the AI model. Therefore, we designed a survey-based online experiment that mirrors the classification task that the AI model performed. The main part of the survey shows a consecutive set of pairs of facial images to participants (one entrepreneur and one non-entrepreneur per pair). Participants were asked to indicate which of the two individuals they think is the entrepreneur. The subsample of facial images that we used to construct the pairs of images used in our online experiment comprises approximately 2,150 male and 500 female individuals. These images are randomly selected from our test set (which comprises 25% of our total sample of Crunchbase images). Hence, these images were not used to train the model. In the online experiment, each participant was shown a total of 10 pairs of facial images.
Hence, if a participant correctly identified the entrepreneur in 10 out of 10 pairs of facial images, their accuracy was 100%, while random guessing across the 10 pairs should result in an accuracy of around 50% across respondents. Because our sample contains some famous entrepreneurs and non-entrepreneurs, we also included a question asking respondents to tick a box if they recognized any person shown in the facial image pair. When calculating the accuracy scores, we removed all decisions in which the respondents indicated that they knew one of the individuals depicted.

After the classification tasks, we collected information on participants’ age, country/region of origin, gender, highest degree of education, main field of education, and main type of work experience (aside from potential investing activities). We also captured participants’ expertise by asking them to indicate which expert category they best fit in. Response options included (a) full-time entrepreneur, (b) part-time entrepreneur, (c) professional investor (i.e., venture capitalist or business angel), (d) other employees, (e) entrepreneurship researcher, (f) entrepreneurship educator, (g) student, (h) prefer not to say, (i) none of the above. Because we are interested in the performance of entrepreneurship experts, we only keep participants who self-assign to (a), (b), (c), (e), or (f).

We solicited participants in two ways: First, we used Prolific (www.prolific.com) to recruit a sample of entrepreneurs. Second, we recruited participants by distributing the survey in the authors’ professional networks and via social media (addressing entrepreneurship circles). In total, we were able to collect responses from 650 human (self-assigned) entrepreneurship experts who made a total of 6,500 decisions. Because we remove all decisions in which the respondents indicated that they knew one of the individuals depicted, our analyses consider 6,431 out of 6,500 decisions.
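
The scoring rule for human participants (accuracy over the retained decisions only, with decisions on recognized faces removed) can be sketched as below. This is an illustrative sketch under assumptions: the function names and the `(correct, recognized)` tuple layout are hypothetical, and the second function simply applies the standard binomial formula to show how much a single participant's accuracy would be expected to vary under pure guessing.

```python
import math

def participant_accuracy(decisions):
    """Accuracy in percent over decisions where no face was recognized.

    `decisions` is a list of (correct, recognized) boolean tuples, one per
    image pair (hypothetical layout). Returns None if nothing remains.
    """
    kept = [correct for correct, recognized in decisions if not recognized]
    if not kept:
        return None
    return 100.0 * sum(kept) / len(kept)

def chance_sd(n_pairs, p=0.5):
    """Standard deviation, in percentage points, of one participant's
    accuracy if every decision were an independent guess with success
    probability p (binomial model)."""
    return 100.0 * math.sqrt(p * (1 - p) / n_pairs)
```

For example, `chance_sd(10)` evaluates to roughly 15.8 percentage points, so with only 10 pairs per participant, individual scores well above or below 50% are expected even when everyone is guessing.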
Table 1 displays a breakdown of our human entrepreneurship experts according to expert category, origin, age, gender, highest education degree, and main field of education.

Table 1. Background information about the human entrepreneurship experts (n = 650) who participated in our online experiment.

Item                Category                            Counts  Percent
Expert group        Entrepreneur                           384    59.08
                    Entrepreneurship educator               92    14.15
                    Entrepreneurship researcher            143    22.00
                    Venture capitalist/business angel       31     4.77
Region              Australia/Asia-Pacific                 145    22.31
                    Europe                                 120    18.46
                    Middle East/North Africa                10     1.54
                    South America                           11     1.69
                    USA/Canada                             355    54.62
                    No response                              9     1.38
Age                 24 or younger                           40     6.15
                    25 to 29                                71    10.92
                    30 to 39                               195    30.00
                    40 to 44                               161    24.77
                    50 to 54                               123    18.92
                    60 or older                             59     9.08
                    No response                              1     0.15
Gender              Female                                 244    37.54
                    Male                                   375    57.69
                    Other                                    3     0.46
                    No response                             28     4.31
Highest degree      Below high school degree                 5     0.77
                    High school degree or equivalent        87    13.38
                    Bachelor degree                        174    26.77
                    MBA                                     45     6.92
                    Other Master degree/postgraduate       137    21.08
                    PhD/doctoral degree                    182    28.00
                    Prefer not to say/no response           20     3.08
Field of education  Business or economics                  274    42.15
                    Humanities                              31     4.77
                    Law                                     18     2.77
                    STEM                                   159    24.46
                    Social sciences                         76    11.69
                    Other                                   73    11.23
                    No response                              4     0.62

- Please insert Table 1 about here -

3. Results

3.1 AI model’s accuracy

To assess the performance of the AI model in distinguishing entrepreneurs from non-entrepreneurs, we evaluate the accuracy as

$$\text{Accuracy} = \frac{TP + TN}{TP + FP + TN + FN} \times 100$$

where TP represents the count of correctly identified entrepreneurs (true positive), TN denotes the count of correctly identified non-entrepreneurs (true negative), FP represents the count of non-entrepreneurs identified as entrepreneurs (false positive), and FN denotes the count of entrepreneurs identified as non-entrepreneurs (false negative). The average accuracy of the AI model is obtained from randomly initializing the internal weights of the AI model and training the model 10 times.
The accuracy of the model, when it converges, is taken as the accuracy of that trial. The accuracies obtained in the 10 trials are [80.30, 78.76, 79.28, 77.89, 79.98, 79.53, 79.52, 80.39, 79.20, 80.24], which yield an average accuracy of 79.51% (SD = 0.78). This suggests that when presented with a pair of images, our AI model can identify the entrepreneur with an accuracy of 79.51%, which is our main result. This accuracy is well above random guessing (i.e., 50%), indicating that our AI model is indeed able to identify systematic differences that distinguish the facial images of entrepreneurs from those of non-entrepreneurs.

3.2 Accuracy achieved by human entrepreneurship experts

We then compare the AI model’s accuracy to the accuracy that human entrepreneurship experts would achieve on the same task, as captured via our online experiment. The experts we selected have expertise regarding entrepreneurs, making them the best human comparison group for our AI vs. humans test. Table 2 shows that the mean accuracy across all subgroups of human entrepreneurship experts is 49.42% (SD = 15.93). The accuracy is highest among entrepreneurship researchers (51.24%), while it is lowest among professional venture capitalists and business angels (43.87%). However, these differences are not very pronounced, so that the mean accuracy of the human judges is relatively homogeneous around or slightly below the chance level of 50%. Because each respondent was shown a set of pairs of facial images comprising one entrepreneur and one non-entrepreneur, this is equivalent to a random guess, indicating that human judges cannot systematically distinguish entrepreneurs from non-entrepreneurs. The result of the t-test in Table 2 documents that our human entrepreneurship experts achieve significantly lower accuracies than the AI model (p < 0.01), indicating that the AI model’s performance is indeed “beyond-human”.
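
The accuracy definition and the trial statistics reported in Section 3.1 can be reproduced with a few lines of Python. This is a small check, not the authors' code: the function name `accuracy_percent` is chosen for illustration, and the use of the sample standard deviation (n - 1 in the denominator) is an assumption, made because it matches the reported value of 0.78.

```python
import statistics

def accuracy_percent(tp, tn, fp, fn):
    """Accuracy in percent from confusion-matrix counts, as defined in Section 3.1."""
    return 100.0 * (tp + tn) / (tp + fp + tn + fn)

# Test accuracies of the 10 training runs, copied from Section 3.1.
trial_accuracies = [80.30, 78.76, 79.28, 77.89, 79.98,
                    79.53, 79.52, 80.39, 79.20, 80.24]

mean_acc = statistics.mean(trial_accuracies)
sd_acc = statistics.stdev(trial_accuracies)  # sample SD (n - 1), an assumption

print(f"mean = {mean_acc:.2f}, SD = {sd_acc:.2f}")  # prints "mean = 79.51, SD = 0.78"
```

As a usage example for the formula, `accuracy_percent(40, 40, 10, 10)` returns 80.0, that is, 80 correct decisions out of 100.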
Table 2. Main results: mean accuracy of selected subgroups within our human judges and comparison with the performance of our AI model.

Model/sample                          n (classifications)  Mean accuracy (SD)  t-test: human experts vs. AI model (p-value)
AI model                              10 (-)               79.51 (0.78)        -
Human experts                         650 (6,431 a)        49.42 (15.93)       5.92 (0.00)
  Entrepreneur                        384 (3,791)          50.27 (15.66)       5.90 (0.00)
  Entrepreneurship educator           92 (911)             47.74 (17.35)       5.76 (0.00)
  Entrepreneurship researcher         143 (1,419)          51.24 (15.42)       5.78 (0.00)
  Venture capitalist/business angel   31 (310)             43.87 (14.30)       7.81 (0.00)
Trained humans                        133 (1,273)          48.12 (17.99)       5.50 (0.00)

Notes: The human experts (n = 650) did not undergo any specific training. The “trained humans” (n = 133) were exposed to a brief training before participating in our online experiment. The t-test confirms a significant difference between the performance of human judges and the AI model (p < 0.01). a = each classification refers to a human expert being shown a pair of facial images and indicating who they think is an entrepreneur. While every respondent performed 10 classifications, the number of classifications that we use to calculate the accuracies is slightly lower than 6,500 (= 650 participants making 10 classifications each) because we remove those classifications in which respondents indicated that they know one of the individuals depicted (i.e., recognized the entrepreneur or non-entrepreneur).

- Please insert Table 2 about here -

3.3 Further analyses and robustness checks

To shed some light on the functioning of the AI model, our findings’ robustness, and their validity, we perform a range of further analyses and robustness checks. We briefly summarize these analyses below and report more technical details in Supplementary Information: Appendix A.

3.3.1 Exploring the AI model’s decisions

We observed considerable variability in the visual information that the AI model picked up and used to classify entrepreneurs and non-entrepreneurs. To illustrate, in Fig.
2 we create heatmaps for two pairs of entrepreneurs and non-entrepreneurs (pair 1: images a and b; pair 2: images c and d) that highlight the areas in the input images that are critical for the AI model’s decision-making. Each heatmap shows the top 50 sub-regions that contribute to the model decision. In addition, the sub-region boundaries are indicated in yellow.

Considering this variability among the sub-regions selected by the AI model, we conducted a systematic analysis using generally accepted central facial landmarks (nose, eyes, mouth). In this analysis, we input only a single facial landmark or a combination of them, which reveals the most significant landmarks for the classification of entrepreneurs and non-entrepreneurs (for more technical information, see Appendix A2). The results in Fig. 3 indicate that the highest accuracy stems from the visual information associated with the nose region. Also, the accuracy improves when the visual information associated with facial landmarks is considered jointly. However, the combined model does not outperform our main model described in Section 3.1, which uses the entire facial information as input. A potential explanation is that our main model has the capacity to also capture other facial attributes such as skin textures, in addition to the important landmarks, and systematically attends to these salient attributes, learning complex non-linear relationships among them.

- Please insert Figs. 2 and 3 about here -

3.3.2 Model bias: Gender and race

Because AI models are prone to biases, we explicitly analyze the bias and sensitivity of the trained model towards gender and race. Given that most of our training data is from male individuals (81%), we assess whether the model achieves a higher (or lower) accuracy when evaluating male vs. female images. Assessing the gender-wise accuracy of the AI model on the test set, the AI model achieves an accuracy of 78.5% for male images, and 83.1% for female images.
This indicates that the trained model performs comparably for male and female images, with a somewhat higher accuracy for female images. Going further, to better understand what features of the face region are extracted by the face classifier and to understand whether these identified features have any gender bias, we generate an embedding space visualization for a set of randomly chosen samples (Appendix A3). This analysis shows that the AI model’s learned embedding space separation is based on the entrepreneur/non-entrepreneur (ENT/Non-ENT) labels and does not seem to be biased towards a specific gender.

Racial bias is another area of concern, given that most individuals in our sample can be categorized as white. This implies that our AI model could be biased and perform differently for facial images that depict non-white individuals. Because information on racial background is not included in Crunchbase, two authors manually inspected all of the ~2,600 facial images used in our online experiment. Both researchers were tasked to independently identify and remove all facial images that depicted individuals that they would classify as white, leaving only facial images of individuals with racial backgrounds other than white. We then separately tested the trained model using these subsets of non-white images. It should be noted that these facial images have not been used for model training and were part of our testing set. The resulting accuracies for researcher 1 are 78.24% for male individuals (n = 489) and 80.75% for female individuals (n = 140). Similarly, the accuracies for researcher 2 are 77.90% for male individuals (n = 596) and 81.84% for female individuals (n = 176). These results align with our main results, indicating that the AI model does not seem to be heavily biased towards a certain racial background of the individuals in our dataset.

3.3.3 Situational factors

Another major caveat is that our AI model could base its predictions on situational factors (see also [5]).
Potential explanations could be that entrepreneurs use different head poses, deliberately employ certain facial expressions (e.g., smiling as impression management), have more professional photographs, or have more professional make-up or lighting than non-entrepreneurs. So far, our additional analyses (e.g., visual information associated with facial landmarks) indicate that the high accuracy of our model is likely due to facial morphology and potentially not due to situational factors (such as facial expression or head pose in the facial image), but this interpretation might be premature. To address this point in more detail, we randomly selected 10 entrepreneurs in our test set and artificially altered their expression, gaze, and emotions using the Hey-Photo (https://hey-photo.com) online editor, which uses generative AI technology to alter the person’s expression, smile, and gaze in a given image. After altering the faces, we tested our model using the 10 new images and compared the model’s performance on the bona fide and the synthesized images. When considering the average change in model confidence in identifying the entrepreneur, we observed only a 4.26% change from its original confidence level. As such, this provides some indication that our model is not biased towards the facial expressions and emotions of a given subject.

3.3.4 Public figures

To provide a more illustrative example of the power of the AI model, we also present accuracy results for public figures (e.g., famous entrepreneurs). Note that we first defined a group of interesting cases, and then generated and reported the results for these cases. Hence, we did not engage in any sort of selective reporting (where one would only present particularly impressive results while omitting other tests and results [30]). First, we start with evaluating facial images of one of the most famous entrepreneurs currently active, Elon Musk. As illustrated in Fig.
4 (panel a), our model classifies Elon Musk as an entrepreneur with a probability of 98.8%, suggesting that the AI model is highly confident in its prediction. We repeat this for a selection of facial images of other famous entrepreneurs (Fig. 5), with similar results. In addition, in Fig. 4, panels (b) to (d), we also analyze different images of Elon Musk, in which he shows different emotions/facial expressions and head poses than in the first facial image (which might indeed be interpreted as a very confident/optimistic look/expression). The accuracy results across the panels are almost identical, indicating again that the model is not swayed by situational factors (e.g., smiling or head posture) in a major way. We also do this for a selection of facial images of famous entrepreneurs shown in Fig. 5 (the modified facial images can be requested from the authors). Again, we observe that there are only minor fluctuations in the accuracy level, compared to the accuracy result for the original (real) facial images.

Finally, given the recent discussions on entrepreneurial personalities in political leadership [31, 32, 33] and the relevance of individual differences in the political context [5, 34, 35], we conclude these additional analyses by examining facial images from a selection of political leaders (Fig. 6). With a high probability, the AI model (correctly) identifies the single politician among various political leaders who has a notable career as an entrepreneur. Hence, we report additional anecdotal evidence that the AI model identifies entrepreneurial individuals with high probability across these assessments of facial images from public figures (i.e., famous entrepreneurs and politicians).

- Please insert Figs. 4, 5 and 6 about here -

3.3.5 Authors of this study

Finally, we took advantage of the diverse backgrounds of the author team in terms of entrepreneurial behavior.
Two authors (the entrepreneurship scholars) had started their own businesses in the past, whereas the other two authors (the machine learning scholars) had not. The result is shown in Figure SI4 (Appendix A). Again, the AI model correctly assigns a high probability of entrepreneurship to the two authors with significant entrepreneurial tendencies in their occupational careers (own entrepreneurial behavior and entrepreneurship as the subject of their academic discipline), but not to those without such tendencies.

3.3.6 Testing trained humans

While our AI model underwent an extensive training process (with data from the same dataset that was also used to test its accuracy), the human experts were tested with data from the same dataset but did not undergo such training upfront. Hence, our results could be driven by this difference in training. It is at least possible that human participants, if given the chance to first visually compare the facial images of entrepreneurs and non-entrepreneurs in this dataset, would also have been able to spot systematic differences (making them trained humans), which could improve their performance in the classification experiment. Therefore, we devised a brief training program to ‘level the playing field’. Specifically, we extracted a random sample of image pairs from our online classification experiment (48 male pairs and 12 female pairs, in line with the gender distribution in our full sample of facial images retrieved from Crunchbase). We prepared a presentation (PowerPoint slides) with 12 entrepreneurs and 12 non-entrepreneurs per slide (= 24 facial images per slide). The images were labeled with their group (‘entrepreneur’ or ‘non-entrepreneur’) so that training participants were able to compare the facial images of entrepreneurs and non-entrepreneurs.

We used these training slides, which are included in the Supplementary Information (Appendix B), in an in-person classroom setting in entrepreneurship and business bachelor’s and master’s courses at the University of Amsterdam and the University of Luxembourg. Using the large screen at the front of the classroom, we exposed students to the training material for approximately 10 minutes, asking them to concentrate fully on the images and to examine and memorize any existing group differences. Afterwards, we asked them to participate in our online classification experiment. As described in Table 2, we collected responses from 133 individuals, who made 1,273 classifications. The average accuracy is 48.12% (SD = 17.99). Thus, the training did not significantly enhance the performance of human participants, and the AI model still outperformed the trained humans by a large margin.

In Fig. 7, we summarize our study’s core results on the accuracy of the AI model versus human experts (main result), together with the trained humans (robustness check). We also provide context for the strength of the AI accuracy by comparing our results to other studies examining the accuracy of AI-supported face analysis in predicting other outcomes (e.g., political orientation).

- Please insert Fig. 7 about here -

4. Concluding remarks

4.1 Discussion

While the findings of any single study should be approached with caution, our research indicates that deep learning algorithms can discern occupational outcomes (specifically, distinguishing between entrepreneurs and non-entrepreneurs) from publicly available human-centric datasets like Crunchbase with substantial and above-chance accuracy (79.51%). Conversely, human raters did not exceed chance levels in a comparable task.
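The chance-level comparison for the trained participants (Section 3.3.6) can be illustrated with a back-of-the-envelope calculation. The sketch below is not the analysis used in this study. It assumes that the reported SD of 17.99 is the standard deviation of per-participant accuracy across the 133 participants, that chance accuracy in the two-class task is 50%, and that the t distribution can be approximated by a standard normal at this sample size. The helper name `one_sample_t` is hypothetical.

```python
import math
from statistics import NormalDist


def one_sample_t(mean: float, sd: float, n: int, mu0: float) -> float:
    """Return the t statistic for testing a sample mean against mu0."""
    standard_error = sd / math.sqrt(n)
    return (mean - mu0) / standard_error


# Values reported in Section 3.3.6 (accuracy in percent).
MEAN_ACCURACY = 48.12
SD_ACCURACY = 17.99
N_PARTICIPANTS = 133  # assumption: the SD is taken across participants
CHANCE_LEVEL = 50.0   # assumption: guessing in a two-class task yields 50%

t_stat = one_sample_t(MEAN_ACCURACY, SD_ACCURACY, N_PARTICIPANTS, CHANCE_LEVEL)

# With 132 degrees of freedom the t distribution is close to the standard
# normal, so the two-sided p-value is approximated with NormalDist.
p_two_sided = 2 * (1 - NormalDist().cdf(abs(t_stat)))

print(f"t = {t_stat:.2f}, approximate two-sided p = {p_two_sided:.2f}")
```

Under these assumptions the statistic is roughly -1.2, well short of conventional significance thresholds, which is consistent with trained humans performing at chance level.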
These findings add to our knowledge of the capabilities of AI in (a) extracting a whole range of private personal information from readily accessible human-centric data [2, 3, 4, 5] and (b) outperforming humans, including experts, in such tasks. As highlighted by Kosinski [4], “one’s face is particularly difficult to hide in both interpersonal interactions and digital records,” which makes private information that can be derived from facial images with substantial accuracy, including occupational details, sensitive information that can easily circulate within society, businesses, organizations, and among individuals. For example, information about entrepreneurial experience can be valuable because such experience has been linked to advantages for (budding) entrepreneurs in terms of access to entrepreneurial resources, including venture capital [36, 37], increased venture survival [38], and public policy schemes [39, 40].

While these are important research findings in themselves, it is also crucial to emphasize that the implications and application of such findings must be approached with utmost care. The potential for unintended consequences is high, and there are many pitfalls that one might not even be aware of. These pitfalls extend beyond social bias, such as stereotypes and discrimination, to include a range of societal risks. Misapplication of these findings can lead to significant ethical issues and negative impacts on society [6, 17, 18]. Therefore, our results should not be misinterpreted as broadly endorsing the widespread use of such AI methods, including facial recognition, for evaluating and classifying people. Testing these capabilities in our case does not imply reinforcement or recommendation in practice. The sole aim of this study was to assess AI’s capabilities compared to human performance within the limitations of the study design and potential social bias, which is a crucial consideration.

For example, certain groups could be exposed to discrimination and stereotyping, which could affect their likelihood of becoming an entrepreneur and hence their representation in the dataset we used and the respective classification results. Social bias in the real world is well documented in a myriad of studies (e.g., in investment decisions affecting entrepreneurs [13] or the ‘what is beautiful is good’ stereotype [41]; see also [25]). We also acknowledge the well-documented dangers of following any ‘illusions’ of understanding AI-driven research results [42]. Hence, we highlight the findings as they are, focusing on the predictive capacity of AI in comparison to humans. However, it is essential to discuss them in the context of potential biases when considering the actual meanings and implications for society. We believe that our findings and this contextualized interpretation, which hints at the potential of social bias, add significant knowledge to the respective debates in society, given the unquestionable disruptive potential of such AI methods and data in the real world, along with their major ethical implications affecting large parts of society.

4.2 Limitations

Our study has various limitations. First, although we ran several additional tests, we cannot say with absolute certainty what information the AI model picked up to distinguish entrepreneurs from non-entrepreneurs. For example, there is still a small chance, in our view, that facial expression or other situational factors could have played a role. However, as noted before, our main goal was not to identify the actual distinguishing features (e.g., what an entrepreneurial face looks like), but to test whether entrepreneurs are different at all and whether this can be reliably predicted in an AI vs. humans setting.

Second, as with any empirical research project, the quality of the data used to train and test the AI model is critical.
If the data is inaccurate or biased, the AI model will absorb these inconsistencies and generate flawed conclusions that may reproduce or amplify the biases present in the training data [43, 44]. We use Crunchbase to retrieve and identify our sample of entrepreneurs and non-entrepreneurs. Crunchbase is one of the most widely used databases in contemporary entrepreneurship research due to its recent, accurate, and comprehensive coverage [14, 19, 20, 21]. Despite its coverage, we acknowledge that the founding history recorded in Crunchbase might not be completely accurate for every individual. For example, some individuals that we classify as non-entrepreneurs might have participated in founding a new venture that is too insignificant to be recorded in Crunchbase. Others might deliberately omit founding information from their profile to mask past failures. This leads to a situation in which we might falsely classify some individuals as non-entrepreneurs even though they are entrepreneurs. Future research could try to circumvent this potential limitation by collecting training data from different sources (e.g., via surveys) or by performing thorough background checks on the individuals included in Crunchbase to verify the accuracy of their classification as entrepreneurs or non-entrepreneurs.

We also acknowledge that the Crunchbase data that we use in our analyses might not be free of bias or representative of the entire population of entrepreneurs and non-entrepreneurs. Specifically, the individuals that we classify as non-entrepreneurs are prominent business professionals (e.g., CEOs, managers, investors), so our sample of non-entrepreneurs is not a cross-section of the general population. Instead, our assessment is closer to a comparison between entrepreneurs and managers, which is a popular comparison in entrepreneurship research [45, 46]. Moreover, potential biases in our data could stem from the fact that Crunchbase focuses mostly on tech ventures [14] in the US [19]. This suggests that our sample might be skewed towards entrepreneurs and non-entrepreneurs in the US tech sector. While we acknowledge that this sample might not be representative of the entire population of entrepreneurs, we want to emphasize that such ventures are a particularly important source of economic growth and innovation [14], so that our analysis is still impactful and relevant, even when considering this narrower scope. To summarize, while Crunchbase is a state-of-the-art database in contemporary entrepreneurship research and we are not aware of an alternative data source that would allow us to improve our model (i.e., from a technical, legal, and ethical standpoint), the caveats that we acknowledge need to be considered when interpreting the results.

Third, our model is trained with one facial image per person. Using more images per person could change the effectiveness of the training. It might also make it possible to study the role of age. While our sample covers individuals across all age groups, we do not know how the algorithm behaves for facial images of the same individual over time. That is, there might be some bias in our predictions related to age.

Finally, in our analysis we attempted to address potential bias regarding gender and racial background in the AI model, but this cannot completely rule out all biases. Examples include potential biases related to gender identity, disabilities, or ancestry. Because we are unable to reliably infer these characteristics from the data available in Crunchbase, we cannot consider them in our assessment.

4.3 Conclusion

Together with a growing body of related findings, studies like ours show the potential of AI and underscore the need for robust ethical guidelines and regulatory frameworks to govern the use of AI and human-centric data.
Such guidelines should cover the extraction of personal information from publicly available data in order to prevent misuse, protect individual privacy, and ensure broader ethical standards when such AI methods are used with certain types of data, given the underlying potential of social bias reflected in these data and the respective AI results. This also highlights the necessity for extreme caution regarding ethical risks in study designs, results, and their interpretation and application. As technology advances, it becomes imperative to balance its potential benefits with the deep ethical challenges it presents, ensuring that AI deployment respects individual privacy and aligns with societal values and ethical standards.

Declarations

Author contributions statement

Martin Obschonka, Christian Fisch, Tharindu Fernando, and Clinton Fookes designed the research; Clinton Fookes developed the deep machine learning research idea and provided supervision; Martin Obschonka, Christian Fisch, and Tharindu Fernando performed the research and analyzed the data; Martin Obschonka, Christian Fisch, and Tharindu Fernando wrote the manuscript. All authors discussed the results. All authors reviewed and approved the manuscript.

Competing interests statement

The authors declare no competing interests.

Data availability statement

The data and code used in this study contain sensitive information and cannot be publicly shared. However, the code and data (except for the facial images) may be shared by the corresponding author upon reasonable request.

References

1. Ranjan, R. et al. Deep learning for understanding faces: machines may be just as good, or better, than humans. IEEE Signal Processing Magazine 35, 66–83 (2018).
2. Moreno-Armendáriz, M. A., Martínez, C. A. D., Calvo, H. & Moreno-Sotelo, M. Estimation of personality traits from portrait pictures using the five-factor model. IEEE Access 8, 201649–201665 (2020).
3. Wang, Y. & Kosinski, M. Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Journal of Personality and Social Psychology 114(2), 246–257 (2018).
4. Kosinski, M. Facial recognition technology can expose political orientation from naturalistic facial images. Scientific Reports 11(1), 100 (2021).
5. Kosinski, M., Khambatta, P. & Wang, Y. Facial recognition technology and human raters can predict political orientation from images of expressionless faces even when controlling for demographics and self-presentation. American Psychologist (2024). https://doi.org/10.1037/amp0001295
6. Levine, M., Philpot, R., Nightingale, S. J. & Kordoni, A. Visual digital data, ethical challenges, and psychological science. American Psychologist 79(1), 109–122 (2024).
7. Santow, E. Emerging from AI utopia. Science 368, 9–9 (2020).
8. Bosma, N. The Global Entrepreneurship Monitor (GEM) and its impact on entrepreneurship research. Foundations and Trends in Entrepreneurship 9(2), 143–248 (2013).
9. McClelland, D. C. The achieving society (Van Nostrand Rinehold, 1961).
10. Haltiwanger, J. Entrepreneurship in the twenty-first century. Small Business Economics 58(1), 27–40 (2022).
11. Schumpeter, J. A. The theory of economic development (Harvard University Press, 1934).
12. Baum, J. R. & Locke, E. A. The relationship of entrepreneurial traits, skill, and motivation to subsequent venture growth. Journal of Applied Psychology 89(4), 587–598 (2004).
13. Brooks, A. W., Huang, L., Kearney, S. W. & Murray, F. E. Investors prefer entrepreneurial ventures pitched by attractive men. Proceedings of the National Academy of Sciences 111(12), 4427–4431 (2014).
14. Freiberg, B. & Matz, S. C. Founder personality and entrepreneurial outcomes: a large-scale field study of technology startups. Proceedings of the National Academy of Sciences 120(19), e2215829120 (2023).
15. Lazear, E. P. Balanced skills and entrepreneurship. American Economic Review 94(2), 208–211 (2004).
16. Lindquist, M. J., Sol, J. & Van Praag, M. Why do entrepreneurial parents have entrepreneurial children? Journal of Labor Economics 33(2), 269–296 (2015).
17. Landers, R. N. & Behrend, T. S. Auditing the AI auditors: a framework for evaluating fairness and bias in high stakes AI predictive models. American Psychologist 78(1), 36–49 (2023).
18. Madan, S., Savani, K. & Johar, G. V. How you look is who you are: the appearance reveals character lay theory increases support for facial profiling. Journal of Personality and Social Psychology 123(6), 1223–1242 (2022).
19. Ter Wal, A. L., Alexy, O., Block, J. & Sandner, P. G. The best of both worlds: the benefits of open-specialized and closed-diverse syndication networks for new ventures’ success. Administrative Science Quarterly 61(3), 393–432 (2016).
20. Yu, S. How do accelerators impact the performance of high-technology ventures? Management Science 66(2), 530–552 (2020).
21. Fisch, C. & Block, J. H. How does entrepreneurial failure change an entrepreneur’s digital identity? Evidence from Twitter data. Journal of Business Venturing 36(1), 106015 (2021).
22. Momtaz, P. P. CEO emotions and firm valuation in initial coin offerings: an artificial emotional intelligence approach. Strategic Management Journal 42(3), 558–578 (2021).
23. Warnick, B. J., Davis, B. C., Allison, T. H. & Anglin, A. H. Express yourself: facial expression of happiness, anger, fear, and sadness in funding pitches. Journal of Business Venturing 36(4), 106109 (2021).
24. Colombo, M. G., Fisch, C., Momtaz, P. P. & Vismara, S. The CEO beauty premium: founder CEO attractiveness and firm valuation in initial coin offerings. Strategic Entrepreneurship Journal 16(3), 491–521 (2022).
25. Stefanidis, D., Nicolaou, N., Charitonos, S. P., Pallis, G. & Dikaiakos, M. What’s in a face? Facial appearance associated with emergence but not success in entrepreneurship. The Leadership Quarterly 33(2), 101597 (2022).
26. Cao, Q., Shen, L., Xie, W., Parkhi, O. M. & Zisserman, A. VGGFace2: a dataset for recognizing faces across pose and age. 2018 13th IEEE International Conference on Automatic Face & Gesture Recognition (FG 2018), 67–74 (2018).
27. Rassadin, A., Gruzdev, A. & Savchenko, A. Group-level emotion recognition using transfer learning from face identification. Proceedings of the 19th ACM International Conference on Multimodal Interaction, 544–548 (2017).
28. Zhang, K., Zhang, Z., Li, Z. & Qiao, Y. Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters 23(10), 1499–1503 (2016).
29. Acevez-Fernandez, M. A. Advances and applications in deep learning (IntechOpen, 2020).
30. Bruns, S. B., Deressa, T. K., Stanley, T. D., Doucouliagos, C. & Ioannidis, J. P. Estimating the extent of selective reporting: an application to economics. Research Synthesis Methods 15(4), 590–602 (2024).
31. Li, H., Meng, L. & Zhang, J. Why do entrepreneurs enter politics? Evidence from China. Economic Inquiry 44(3), 559–578 (2006).
32. Nyström, K. Entrepreneurial politicians. Small Business Economics 41(1), 41–54 (2013).
33. Obschonka, M. & Fisch, C. Entrepreneurial personalities in political leadership. Small Business Economics 50, 851–869 (2018).
34. Aichholzer, J. & Willmann, J. Desired personality traits in politicians: similar to me but more of a leader. Journal of Research in Personality 88, 103990 (2020).
35. Schoen, H. & Schumann, S. Personality traits, partisan attitudes, and voting behavior. Evidence from Germany. Political Psychology 28(4), 471–498 (2007).
36. Hsu, D. H. Experienced entrepreneurial founders, organizational capital, and venture capital funding. Research Policy 36(5), 722–741 (2007).
37. Zhang, J. The advantage of experienced start-up founders in venture capital acquisition: evidence from serial entrepreneurs. Small Business Economics 36, 187–208 (2011).
38. Paik, Y. Serial entrepreneurs and venture survival: evidence from US venture‐capital‐financed semiconductor firms. Strategic Entrepreneurship Journal 8(3), 254–268 (2014).
39. Baum, J. A. & Silverman, B. S. Picking winners or building them? Alliance, intellectual, and human capital as selection criteria in venture financing and performance of biotechnology startups. Journal of Business Venturing 19(3), 411–436 (2004).
40. Shane, S. Why encouraging more people to become entrepreneurs is bad public policy. Small Business Economics 33, 141–149 (2009).
41. Dion, K., Berscheid, E. & Walster, E. What is beautiful is good. Journal of Personality and Social Psychology 24(3), 285–290 (1972).
42. Messeri, L. & Crockett, M. J. Artificial intelligence and illusions of understanding in scientific research. Nature 627, 49–58 (2024).
43. Barocas, S. & Selbst, A. D. Big data’s disparate impact. California Law Review 104(3), 671–732 (2016).
44. Manyika, J., Silberg, J. & Presten, B. What do we do about the biases in AI? Available at: https://hbr.org/2019/10/what-do-we-do-about-the-biases-in-ai (2019).
45. Brandstätter, H. Personality aspects of entrepreneurship: a look at five meta-analyses. Personality and Individual Differences 51(3), 222–230 (2011).
46. Stewart, Jr., W. H. & Roth, P. L. Risk propensity differences between entrepreneurs and managers: a meta-analytic review. Journal of Applied Psychology 86(1), 145–153 (2001).

Supplementary Files: Appendix.docx
Figure 1. Architecture of the CNN-based classification model.

Figure 2. Example visualizations highlighting the most salient sub-regions for the model decision (in red) and sub-region boundaries (in yellow) from individual cases.

Figure 3. Results (accuracy) of the facial landmarks-based classifier. Notes: 50% is the accuracy of chance.

Figure 4. AI model’s estimated probability for classifying facial images of Elon Musk as an entrepreneur. To assess the model’s robustness, we consider images that differ in terms of facial expression, appearance, lighting, and contrast. Notes: ENT = entrepreneur, Non-ENT = non-entrepreneur. The images in this figure are in the public domain and free from copyright restrictions. The images are real and were not altered.

Figure 5. Predictions by our AI model for facial images of famous male and female entrepreneurs. The AI model successfully categorizes all individuals as entrepreneurs. Notes: The images in this figure are in the public domain and free from copyright restrictions.

Figure 6. Estimated entrepreneurship trait from the deep learning model for political leaders. The model successfully identifies Trump’s history as (notable) entrepreneur. Notes: The images in this figure are in the public domain and free from copyright restrictions.

Figure 7. Accuracy of the AI model vs. human experts and trained humans (present study). Right side for comparison: Accuracy of recent AI studies using face analysis techniques to predict outcomes.
Importantly, Crunchbase provides information on entrepreneurs and non-entrepreneurs (e.g., managers and other employees of entrepreneurial ventures, investors). Each individual has a profile page that contains demographic information and data on the individual\u0026rsquo;s employment and entrepreneurship history. Crunchbase also displays profile pictures (mostly facial images) that are publicly accessible. We retrieved our sample of facial images from Crunchbase in March 2019, considering individuals with a CB rank between 1 and 100,000. The CB rank is an internal identifier that Crunchbase uses to indicate prominent individuals. Among those individuals, we only considered individuals located in the US and with non-missing gender information. This process yielded an initial sample of 42,043 individuals. To identify entrepreneurs, we used information on the number of organizations that the individual had founded [\u003cspan\u003e21\u003c/span\u003e]. Hence, we characterized individuals who had founded at least one organization as entrepreneurs (n\u0026thinsp;=\u0026thinsp;25,071, 59.6%). Our initial sample was reduced to 40,728 when removing individuals without processable facial images in Crunchbase (e.g., placeholders, comic images). 81% (19%) of our facial images refer to male (female) individuals.\u003c/p\u003e\n \u003cp\u003eUsing facial images as a data source has only recently gained attention in entrepreneurship research. Specifically, entrepreneurship research has begun to use facial images to capture emotions [\u003cspan\u003e22\u003c/span\u003e, \u003cspan\u003e23\u003c/span\u003e] and attributes such as attractiveness, competence, and intelligence [\u003cspan\u003e24\u003c/span\u003e]. 
Additionally, a recent study analyzes indicators of facial geometry (i.e., facial width-to-height ratio, cheekbone prominence, facial symmetry) and facial appearance to predict whether an individual emerges as an entrepreneur as well as entrepreneurial success [\u003cspan\u003e25\u003c/span\u003e].\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec5\"\u003e\n \u003ch2\u003e2.1.2 Data pre-processing\u003c/h2\u003e\n \u003cp\u003eWe pre-process the raw data to filter out irrelevant information and clean the data. Specifically, we resize all facial images to a uniform size of 224\u0026times;224 pixels, which is a standard input size used in many machine learning and face-related applications [\u003cspan\u003e26\u003c/span\u003e, \u003cspan\u003e27\u003c/span\u003e]. This resizing is also necessary to match the input size of the pre-trained feature extraction model that we later use.\u003c/p\u003e\n \u003cp\u003eMoreover, facial images can contain additional information apart from the individual\u0026rsquo;s face (e.g., background, other body parts). Therefore, we leverage a face detector [\u003cspan\u003e28\u003c/span\u003e] to detect the face region in each image. The face detector outputs the coordinates of the face in the image, which can then be used to crop each facial image so that it only contains the face region (for more technical information, see Supplementary Information: Appendix A).\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec6\"\u003e\n \u003ch2\u003e2.1.3 Feature extraction\u003c/h2\u003e\n \u003cp\u003eWe then extract meaningful representations (i.e., features) from the high-dimensional raw input data, which are then used by the AI model to distinguish entrepreneurs from non-entrepreneurs. 
As our facial feature extractor, we use a VGG-Face 2 model [\u003cspan\u003e26\u003c/span\u003e], which is a prominent deep learning model that was pre-trained on a large dataset of facial images.\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec7\"\u003e\n \u003ch2\u003e2.1.4 Classifier design and training\u003c/h2\u003e\n \u003cp\u003eWe adopt a Convolutional Neural Network (CNN) architecture as our classifier. The objective of the CNN is to further refine the features that the pre-trained feature extractor has extracted and to identify the task-specific features. This network first analyzes small parts of the input image, recognizing common patterns such as edges or distinct colors in the input. These features are then combined layer by layer into a spatial hierarchy of features. This is achieved with pooling layers, which downsample the detected features, reducing spatial dimensions while preserving important features. Therefore, CNNs can form complex shapes and object parts from simple edges and textures. Due to this property, CNNs are highly effective in extracting salient characteristics from images, which is why we utilize a CNN as our backbone network for feature extraction.\u003c/p\u003e\n \u003cp\u003eIn the proposed AI model, we need to compare two facial images. We leverage a shared CNN backbone to extract one feature representation per image. The shared CNN backbone allows comparing the two images with respect to the common features that the CNN backbone has learned. After extracting the features of both inputs, our AI model compares the similarities and differences of the two faces. 
Figure\u0026nbsp;\u003cspan\u003e1\u003c/span\u003e provides an overview of this CNN-based classification model.\u003c/p\u003e\n \u003cp\u003e\u003cem\u003e- Please insert\u003c/em\u003e Fig.\u0026nbsp;\u003cspan\u003e1\u003c/span\u003e \u003cem\u003eabout here -\u003c/em\u003e\u003c/p\u003e\n \u003cp\u003eIn the subsequent classifier training stage, the AI model learns to automatically identify and extract task-specific informative features from this higher-dimensional input space. Deep machine learning models are capable of learning hierarchical representations of features through multiple layers of abstraction. Low-level features like edges or textures are identified in the lower levels of the hierarchy. These features are then combined in the top layers of the hierarchy to form abstract features like object shapes.\u003c/p\u003e\n \u003cp\u003eDeep learning models utilize trainable weights, which are tuned during the training process to locate informative features. Each layer within the hierarchical structure of the deep machine learning model has several thousand such weights that need to be optimized. Tuning these from scratch requires a large number of training samples, which are usually hard to obtain. As a solution, transfer learning approaches have emerged in which pre-trained models that have been trained for a different but related task are leveraged for feature extraction. For example, face analysis tasks can leverage pre-trained face recognition models, which have been extensively trained on large-scale datasets for detecting distinct facial characteristics. These models have learned from vast amounts of data during pre-training and can extract comprehensive and richer feature representations. We fine-tune our classification model to identify relevant task-specific features from the set of features that the pre-trained model extracts. 
Therefore, transfer learning reduces the need for an extensive collection of large-scale datasets (for more information related to transfer learning technology, see [\u003cspan\u003e29\u003c/span\u003e]).\u003c/p\u003e\n \u003cp\u003eDuring the training, we provide pairs of facial images to our model. Each pair of images comprises one entrepreneur and one non-entrepreneur. We identify the samples within the pair of inputs as left-image and right-image. When generating the face image pairs, we ensure that both images in a pair show individuals of the same gender (either both male or both female). We randomly vary the position in which the entrepreneur\u0026rsquo;s face appears within the pair, so that it can appear as either the left or the right input image. Our model outputs 0 (zero) if the left image is an entrepreneur and 1 (one) if the right image is an entrepreneur. Through the learning process, the model learns which features are important for making accurate comparisons, and our model can identify which image is of an entrepreneur.\u003c/p\u003e\n \u003cp\u003eUsing the entrepreneur and non-entrepreneur facial images in our dataset, we randomly pick pairs, ensuring that each pair contains an entrepreneur and a non-entrepreneur. We follow the standard training and testing evaluation protocol in machine learning and randomly select 75% of these pairs for model training and the remaining 25% for model testing. A subset (10%) of the training data is held out as a validation set to evaluate the model\u0026rsquo;s performance on unseen data. Good performance on the validation set indicates good generalization of the model without overfitting. We then randomly initialize the weights of the model 10 times. In each repetition, the model reaches a different convergence point, which is reflected in the accuracy scores on the test data. The training and validation accuracy curves of the proposed model are provided in Figure SI1 (Supplementary Information: Appendix A). 
The curves show that there is no major divergence between the training and validation accuracies, which indicates that the model is not overfitting.\u003c/p\u003e\n \u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec8\"\u003e\n \u003ch2\u003e2.2 Experiment with human entrepreneurship experts\u003c/h2\u003e\n \u003cp\u003eWe compare the accuracy of our AI model with that of human entrepreneurship experts to assess the beyond-human capacity of the AI model. To this end, we designed a survey-based online experiment that mirrors the classification task that the AI model performed. The main part of the survey shows a consecutive set of pairs of facial images to participants (one entrepreneur and one non-entrepreneur per pair). Participants were asked to indicate which of the two individuals they think is the entrepreneur. The subsample of facial images that we used to construct the pairs of images used in our online experiment comprises approximately 2,150 male and 500 female individuals. These images are randomly selected from our test set (which comprises 25% of our total sample of Crunchbase images). Hence, these images were not used to train the model.\u003c/p\u003e\n \u003cp\u003eIn the online experiment, each participant was shown a total of 10 pairs of facial images. Hence, if a participant correctly identified the entrepreneur in 10 out of 10 pairs of facial images, their accuracy was 100%, while random guessing across the 10 pairs should result in an accuracy of around 50% across respondents. Because our sample contains some famous entrepreneurs and non-entrepreneurs, we also included a question asking respondents to tick a box if they recognized any person shown in the facial image pair. 
When calculating the accuracy scores, we removed all decisions in which the respondents indicated that they knew one of the individuals depicted.\u003c/p\u003e\n \u003cp\u003eAfter the classification tasks, we collected information on participants\u0026rsquo; age, country/region of origin, gender, the highest degree of education, the main field of education, and their main type of work experience (aside from potential investing activities). We also captured participants\u0026rsquo; expertise by asking them to indicate which expert category they best fit in. Response options included (a) full-time entrepreneur, (b) part-time entrepreneur, (c) professional investor (i.e., venture capitalist or business angel), (d) other employees, (e) entrepreneurship researcher, (f) entrepreneurship educator, (g) student, (h) prefer not to say, (i) none of the above. Because we are interested in the performance of entrepreneurship experts, we only keep participants who self-assign to (a), (b), (c), (e), or (f).\u003c/p\u003e\n \u003cp\u003eWe solicited participants in two ways: First, we used Prolific (\u003cspan\u003e\u003cspan\u003ewww.prolific.com\u003c/span\u003e\u003c/span\u003e) to recruit a sample of entrepreneurs. Second, we recruited participants by distributing the survey in the authors\u0026rsquo; professional networks and via social media (addressing entrepreneurship circles). In total, we were able to collect responses from 650 human (self-assigned) entrepreneurship experts who made a total of 6,500 decisions. 
Because we remove all decisions in which the respondents indicated to know one of the individuals depicted, our analyses consider 6,431 out of 6,500.\u003c/p\u003e\n \u003cp\u003eTable\u0026nbsp;\u003cspan\u003e1\u003c/span\u003e displays a breakdown of our human entrepreneurship experts according to expert category, origin, age, gender, highest education degree, and main field of education.\u003c/p\u003e\n \u003cdiv\u003e\n \u003ctable id=\"Tab1\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv\u003eTable 1\u003c/div\u003e\n \u003cdiv\u003e\n \u003cp\u003eBackground information about the human entrepreneurship experts (n\u0026thinsp;=\u0026thinsp;650) who participated in our online experiment.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003ccolgroup cols=\"4\"\u003e\u003c/colgroup\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eItem\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eCategory\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eCounts\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003ePercent\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eExpert group\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEntrepreneur\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e384\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e59.08\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEntrepreneurship educator\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e92\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e14.15\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEntrepreneurship researcher\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e143\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e22.00\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eVenture capitalist/business angel\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4.77\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eRegion\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAustralia/Asia-Pacific\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e145\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e22.31\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEurope\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e120\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e18.46\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMiddle East/North Africa\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e10\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.54\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSouth America\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e11\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.69\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eUSA/Canada\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e355\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e54.62\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNo response\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e9\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1.38\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eAge\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e24 or younger\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e40\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e6.15\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e25 to 29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e71\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e10.92\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e30 to 39\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e195\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e30.00\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e40 to 44\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e161\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e24.77\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e50 to 54\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e123\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e18.92\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e60 or older\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e59\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e9.08\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNo response\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e1\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.15\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eGender\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eFemale\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e244\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e37.54\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd 
align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMale\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e375\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e57.69\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eOther\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.46\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNo response\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e28\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4.31\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eHighest degree\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBelow high school degree\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.77\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eHigh school degree or equivalent\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e87\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e13.38\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBachelor degree\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"char\"\u003e\n \u003cp\u003e174\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e26.77\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMBA\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e45\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e6.92\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eOther Master degree/postgraduate\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e137\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e21.08\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ePhD/doctoral degree\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e182\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e28.00\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ePrefer not to say/no response\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e20\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e3.08\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eField of education\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBusiness or economics\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e274\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e42.15\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eHumanities\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e4.77\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eLaw\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e18\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e2.77\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSTEM\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e159\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e24.46\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSocial sciences\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e76\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e11.69\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eOther\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e73\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e11.23\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNo response\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
align=\"char\"\u003e\n \u003cp\u003e4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.62\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n \u003cp\u003e\u003cem\u003e- Please insert\u003c/em\u003e Table\u0026nbsp;\u003cspan\u003e1\u003c/span\u003e \u003cem\u003eabout here -\u003c/em\u003e\u003c/p\u003e\n\u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec10\"\u003e\n \u003ch2\u003e3.1 AI model\u0026rsquo;s accuracy\u003c/h2\u003e\n \u003cp\u003eTo assess the performance of the AI model in distinguishing entrepreneurs from non-entrepreneurs, we evaluate the accuracy (in percent) as\u003c/p\u003e\n \u003cdiv id=\"Equa\"\u003e\n \u003cdiv id=\"FileID_Equa\" name=\"EquationSource\"\u003e$$\\text{Accuracy}=\\frac{TP+TN}{TP+FP+TN+FN}\\times 100$$\u003c/div\u003e\n \u003c/div\u003e\n \u003cp\u003ewhere TP represents the count of correctly identified entrepreneurs (true positive), TN denotes the count of correctly identified non-entrepreneurs (true negative), FP represents the count of non-entrepreneurs identified as entrepreneurs (false positive) and FN denotes the count of entrepreneurs identified as non-entrepreneurs (false negative).\u003c/p\u003e\n \u003cp\u003eThe average accuracy of the AI model is obtained by randomly initializing the internal weights of the AI model, training the model 10 times, and averaging the resulting accuracies. The accuracy of the model, when it converges, is taken as the accuracy of that trial. The accuracies obtained in the 10 trials are [80.30, 78.76, 79.28, 77.89, 79.98, 79.53, 79.52, 80.39, 79.20, 80.24], which yield an average accuracy of 79.51% (SD\u0026thinsp;=\u0026thinsp;0.78). This suggests that when presented with a pair of images, our AI model can identify the entrepreneur with an accuracy of 79.51%, which is our main result. 
This accuracy is well above random guessing (i.e., 50%), indicating that our AI model is indeed able to identify systematic differences that distinguish the facial images of entrepreneurs from those of non-entrepreneurs.\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec11\"\u003e\n \u003ch2\u003e3.2 Accuracy achieved by human entrepreneurship experts\u003c/h2\u003e\n \u003cp\u003eWe then compare the AI model\u0026rsquo;s accuracy to the accuracy that human entrepreneurship experts would achieve on the same task, as captured via our online experiment. The experts we selected have expertise regarding entrepreneurs, making them the best human comparison group for our AI vs. humans test.\u003c/p\u003e\n \u003cp\u003eTable\u0026nbsp;\u003cspan\u003e2\u003c/span\u003e shows that the mean accuracy across all subgroups of human entrepreneurship experts is 49.42% (SD\u0026thinsp;=\u0026thinsp;15.93). The accuracy is highest among entrepreneurship researchers (51.24%), while it is lowest among professional venture capitalists and business angels (43.87%). However, these differences are not very pronounced, so the mean accuracy of the human judges clusters around or slightly below the chance level of 50%. Because each respondent was shown a set of pairs of facial images comprising one entrepreneur and one non-entrepreneur, this is equivalent to random guessing, indicating that human judges cannot systematically distinguish entrepreneurs from non-entrepreneurs. 
The result of the t-test in Table\u0026nbsp;\u003cspan\u003e2\u003c/span\u003e documents that our human entrepreneurship experts achieve significantly lower accuracies than the AI model (p\u0026thinsp;\u0026lt;\u0026thinsp;0.01), indicating that the AI model\u0026rsquo;s performance is indeed \u0026ldquo;beyond-human\u0026rdquo;.\u003c/p\u003e\n \u003cdiv\u003e\u0026nbsp;\u003ctable id=\"Tab2\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv\u003eTable 2\u003c/div\u003e\n \u003cdiv\u003e\n \u003cp\u003eMain results: mean accuracy of selected subgroups within our human judges and comparison with the performance of our AI model.\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eModel/sample\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003en\u003c/p\u003e\n \u003cp\u003e(classifications)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eMean accuracy\u003c/p\u003e\n \u003cp\u003e(SD)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003et-test\u003c/p\u003e\n \u003cp\u003eHuman experts vs. 
AI model (p-value)\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eAI model\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e10 (-)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e79.51 (0.78)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eHuman experts\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e650 (6,431\u003csup\u003ea\u003c/sup\u003e)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e49.42 (15.93)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e5.92 (0.00)\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEntrepreneur\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e384 (3,791)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e50.27 (15.66)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5.90 (0.00)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEntrepreneurship educator\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e92 (911)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e47.74 (17.35)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5.76 (0.00)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEntrepreneurship researcher\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e143 (1,419)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n 
\u003cp\u003e51.24 (15.42)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e5.78 (0.00)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eVenture capitalist/business angel\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e31 (310)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e43.87 (14.30)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e7.81 (0.00)\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003eTrained humans\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cstrong\u003e133 (1,273)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e\u003cstrong\u003e48.12 (17.99)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e\u003cstrong\u003e5.50 (0.00)\u003c/strong\u003e\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n \u003cp\u003e\u003cem\u003eNotes:\u0026nbsp;\u003c/em\u003eThe human experts (n=650) did not undergo any specific training. The \u0026ldquo;trained humans\u0026rdquo; (n=133) were exposed to a brief training before participating in our online experiment. The t-test confirms a significant difference between the performance of human judges and AI model (p \u0026lt; 0.01). \u003csup\u003ea\u003c/sup\u003e = each classification refers to a human expert being shown a pair of facial images and indicating who they think is an entrepreneur. 
While every respondent performed 10 classifications, the number of classifications that we use to calculate the accuracies is slightly lower than 6,500 (=650 participants making 10 classifications each) because we remove those classifications in which respondents indicated that they know one of the facial images (i.e., recognized the entrepreneur or non-entrepreneur).\u003cem\u003e\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n \u003cp\u003e\u003cem\u003e- Please insert\u003c/em\u003e Table \u003cspan\u003e2\u003c/span\u003e \u003cem\u003eabout here -\u003c/em\u003e\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec12\"\u003e\n \u003ch2\u003e3.3 Further analyses and robustness checks\u003c/h2\u003e\n \u003cp\u003eTo shed some light on the functioning of the AI model, our findings\u0026rsquo; robustness, and their validity, we perform a range of further analyses and robustness checks. We briefly summarize these analyses below and report more technical details in Supplementary Information: Appendix A.\u003c/p\u003e\n \u003cdiv id=\"Sec13\"\u003e\n \u003ch2\u003e3.3.1 Exploring the AI model\u0026rsquo;s decisions\u003c/h2\u003e\n \u003cp\u003eWe observed considerable variability in the visual information that the AI model picked up and used to classify entrepreneurs and non-entrepreneurs. To illustrate, in Fig.\u0026nbsp;\u003cspan\u003e2\u003c/span\u003e we create heatmaps for two pairs of entrepreneurs and non-entrepreneurs (pair 1: image a) and b), pair 2: image c) and d)) that highlight areas in the input images that are critical for the AI models\u0026rsquo; decision-making. Each heatmap shows the top 50 sub-regions that contribute to the model decision. In addition, the sub-region boundaries are indicated in yellow.\u003c/p\u003e\n \u003cp\u003eConsidering this variability among the selected sub-regions by the AI model, we conducted a systematic analysis using generally accepted central facial landmarks (nose, eyes, mouth). 
In this analysis, we input only a single facial landmark or a combination of them, which reveals the most significant landmarks for the classification of entrepreneurs and non-entrepreneurs (for more technical information, see Appendix A2). The results in Fig.\u0026nbsp;\u003cspan\u003e3\u003c/span\u003e indicate that the highest accuracy stems from the visual information associated with the nose region. Also, the accuracy improves when the visual information associated with facial landmarks is considered jointly. However, the combined model does not outperform our main model described in Section \u003cspan\u003e3.1\u003c/span\u003e, which uses the entire facial information as input. A potential explanation is that our main model can also capture other facial attributes such as skin textures, in addition to the important landmarks, and systematically attends to these salient attributes, learning complex non-linear relationships among them.\u003c/p\u003e\n \u003cp\u003e\u003cem\u003e- Please insert\u003c/em\u003e Figs.\u0026nbsp;\u003cspan\u003e2\u003c/span\u003e and \u003cspan\u003e3\u003c/span\u003e \u003cem\u003eabout here -\u003c/em\u003e\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec14\"\u003e\n \u003ch2\u003e3.3.2 Model bias: Gender and race\u003c/h2\u003e\n \u003cp\u003eBecause AI models are prone to biases, we explicitly analyze the bias and sensitivity of the trained model towards gender and race.\u003c/p\u003e\n \u003cp\u003eGiven that most of our training data is from male individuals (81%), we assess whether the model achieves a higher (or lower) accuracy when evaluating male vs. female images. On the test set, the AI model achieves an accuracy of 78.5% for male images and 83.1% for female images. This indicates that the trained model performs comparably, though not identically, when identifying male and female entrepreneurs (a difference of 4.6 percentage points in favor of female images). 
Going further, to better understand what features of the face region are extracted by the face classifier and to understand whether these identified features have any gender bias, we generate an embedding space visualization for a set of randomly chosen samples (Appendix A3). This analysis shows that the AI model\u0026rsquo;s learned embedding space separation is based on the ENT/Non-ENT labels and does not seem to be biased towards a specific gender.\u003c/p\u003e\n \u003cp\u003eRacial bias is another area of concern, given that most individuals in our sample can be categorized as white. This implies that our AI model could be biased and perform differently for facial images that refer to non-white individuals. Because information on racial background is not included in Crunchbase, two authors manually inspected all the ~\u0026thinsp;2,600 facial images used in our online experiment. Both researchers were tasked to independently identify and remove all facial images that depicted individuals that they would classify as white, leaving only facial images of individuals with racial backgrounds other than white. We then separately tested the trained model using these subsets of non-white images. It should be noted that these facial images have not been used for model training and were part of our testing set. The resulting accuracies for researcher 1 are 78.24% for male individuals (n\u0026thinsp;=\u0026thinsp;489) and 80.75% for female individuals (n\u0026thinsp;=\u0026thinsp;140). Similarly, the accuracies for researcher 2 are 77.90% for male individuals (n\u0026thinsp;=\u0026thinsp;596) and 81.84% for female individuals (n\u0026thinsp;=\u0026thinsp;176). 
These results align with our main results, indicating that the AI model does not seem to be heavily biased towards a certain racial background of the individuals in our dataset.\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec15\"\u003e\n \u003ch2\u003e3.3.3 Situational factors\u003c/h2\u003e\n \u003cp\u003eAnother major caveat is that our AI model could base its predictions on situational factors (see also [\u003cspan\u003e5\u003c/span\u003e]). Potential explanations could be that entrepreneurs use different head poses, deliberately employ certain facial expressions (e.g., smiling as impression management), have more professional photographs, or have more professional make-up or lighting than non-entrepreneurs. So far, our additional analyses (e.g., visual information associated with facial landmarks) indicate that the high accuracy of our model is likely due to facial morphology and potentially not due to situational factors (such as facial expression or head pose in the facial image), but this interpretation might be premature.\u003c/p\u003e\n \u003cp\u003eTo address this point in more detail, we randomly selected 10 entrepreneurs in our test set and artificially altered their expression, gaze, and emotions using the Hey-Photo (\u003cspan\u003e\u003cspan\u003ehttps://hey-photo.com\u003c/span\u003e\u003c/span\u003e) online editor which uses generative AI technology to alter the person\u0026rsquo;s expression, smile, and gaze in a given image. After altering the faces, we tested our model using the new 10 images and compared the performance difference in our model for bona fide and synthesized images. When considering the average change in model confidence in identifying the entrepreneur we observed only a 4.26% change from its original confidence level. 
As such, this provides some indication that our model is not biased towards the facial expressions and emotions of a given subject.\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec16\"\u003e\n \u003ch2\u003e3.3.4 Public figures\u003c/h2\u003e\n \u003cp\u003eTo provide a more illustrative example of the power of the AI model, we also present accuracy results for public figures (e.g., famous entrepreneurs). Note that we first defined a group of interesting cases, and then generated and reported the results for these cases. Hence, we did not engage in any sort of selective reporting (where one would only present particularly impressive results while omitting other tests and results [\u003cspan\u003e30\u003c/span\u003e]).\u003c/p\u003e\n \u003cp\u003eFirst, we start with evaluating facial images of one of the most famous entrepreneurs currently active, Elon Musk. As illustrated in Fig.\u0026nbsp;\u003cspan\u003e4\u003c/span\u003e (panel a), our model classifies Elon Musk as an entrepreneur with a probability of 98.8%, suggesting that the AI model is highly confident in its prediction. We repeat this for a selection of facial images of other famous entrepreneurs (Fig.\u0026nbsp;\u003cspan\u003e5\u003c/span\u003e), with similar results. In addition, in Fig.\u0026nbsp;\u003cspan\u003e4\u003c/span\u003e, panels (b) to (d), we also analyze different images of Elon Musk, in which he shows different emotions/facial expressions and head poses than in the first facial image (which might indeed be interpreted as a very confident/optimistic look/expression). The accuracy results across the panels are almost identical, indicating again that the model is not swayed by situational factors (e.g., smiling or head posture) in a major way. We also do this for a selection of facial images of famous entrepreneurs shown in Fig.\u0026nbsp;\u003cspan\u003e5\u003c/span\u003e (the modified facial images can be requested from the authors). 
Again, we observe that there are only minor fluctuations in the accuracy level, compared to the accuracy result for the original (real) facial images.\u003c/p\u003e\n \u003cp\u003eFinally, given the recent discussions on entrepreneurial personalities in political leadership [\u003cspan\u003e31\u003c/span\u003e, \u003cspan\u003e32\u003c/span\u003e, \u003cspan\u003e33\u003c/span\u003e] and the relevance of individual differences in the political context [\u003cspan\u003e5\u003c/span\u003e, \u003cspan\u003e34\u003c/span\u003e, \u003cspan\u003e35\u003c/span\u003e], we conclude these additional analyses by examining facial images from a selection of political leaders (Fig.\u0026nbsp;\u003cspan\u003e6\u003c/span\u003e). With a high probability, the AI model (correctly) identifies the single politician among various political leaders that has a notable career as an entrepreneur.\u003c/p\u003e\n \u003cp\u003eHence, we report additional anecdotal evidence that the AI model identifies entrepreneurial individuals with high probability across these assessments of facial images from public figures (i.e., famous entrepreneurs and politicians).\u003c/p\u003e\n \u003cp\u003e\u003cem\u003e- Please insert\u003c/em\u003e Figs.\u0026nbsp;\u003cspan\u003e4\u003c/span\u003e, \u003cspan\u003e5\u003c/span\u003e and \u003cspan\u003e6\u003c/span\u003e \u003cem\u003eabout here -\u003c/em\u003e\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec17\"\u003e\n \u003ch2\u003e3.3.5 Authors of this study\u003c/h2\u003e\n \u003cp\u003eFinally, we took advantage of the diverse backgrounds of the author team in terms of entrepreneurial behavior. Two authors (the entrepreneurship scholars) had started their own businesses in the past, whereas the other two authors (the machine learning scholars) had not. The result is shown in Figure SI4 (Appendix A). 
Again, the AI model correctly assigns a high probability of entrepreneurship to the two authors with significant entrepreneurial tendencies in their occupational careers (own entrepreneurial behavior and entrepreneurship as the subject of their academic discipline), but not to those without such tendencies.\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec18\"\u003e\n \u003ch2\u003e3.3.6 Testing trained humans\u003c/h2\u003e\n \u003cp\u003eWhile our AI model underwent an extensive training process (with data from the same dataset that was also used to test the accuracy of the AI model), the human experts were tested with data from the same dataset but did not undergo such training upfront. Hence, our results could be driven by this difference in training. It is at least possible that if human participants had been given the chance to first visually inspect the facial images of entrepreneurs vs. non-entrepreneurs in this dataset, they would also have been able to spot systematic differences (making them trained humans). As a result, their performance in the classification experiment could improve.\u003c/p\u003e\n \u003cp\u003eTherefore, we devised a brief training program to \u0026lsquo;level the playing field\u0026rsquo;. Specifically, we extracted a random sample of image pairs from our online classification experiment (48 male pairs, 12 female pairs, in line with the gender distribution in our full sample of facial images retrieved from Crunchbase). We prepared a presentation (PowerPoint slides) in which we included 12 entrepreneurs and 12 non-entrepreneurs per slide (=\u0026thinsp;24 facial images per slide). These images were labeled with the group labels (\u0026lsquo;entrepreneur\u0026rsquo; or \u0026lsquo;non-entrepreneur\u0026rsquo;) so that training participants were able to compare the facial images of entrepreneurs and non-entrepreneurs. 
We used these training slides, which are included in the Supplementary Information: Appendix B, in an in-person classroom setting in entrepreneurship and business bachelor\u0026rsquo;s and master\u0026rsquo;s courses at the University of Amsterdam and the University of Luxembourg. Using the large screen in front of the classroom, we exposed students to the training material for approximately 10 minutes, asking them to fully concentrate on the images to examine and memorize any existing group differences. After exposing the students to the training material, we asked them to participate in our online classification experiment. As described in Table\u0026nbsp;\u003cspan\u003e2\u003c/span\u003e, we were able to collect responses from 133 individuals, making 1,273 classifications. The average accuracy is 48.12% (SD\u0026thinsp;=\u0026thinsp;17.99). Thus, the training did not significantly enhance the performance of human participants, and the AI model still outperformed the trained humans by a large margin.\u003c/p\u003e\n \u003cp\u003eIn Fig.\u0026nbsp;\u003cspan\u003e7\u003c/span\u003e, we summarize our study\u0026rsquo;s core results on the accuracy of the AI model versus human experts (main result), together with the trained experts (robustness check). We also provide context for the strength of the AI accuracy by comparing our results to other studies examining the accuracy of AI-supported face analysis in predicting other outcomes (e.g., political orientation).\u003c/p\u003e\n \u003cp\u003e\u003cem\u003e- Please insert\u003c/em\u003e Fig.\u0026nbsp;\u003cspan\u003e7\u003c/span\u003e \u003cem\u003eabout here -\u003c/em\u003e\u003c/p\u003e\n \u003c/div\u003e\n\u003c/div\u003e"},{"header":"4. 
Concluding remarks","content":"\u003cdiv id=\"Sec20\" class=\"Section2\"\u003e \u003ch2\u003e4.1 Discussion\u003c/h2\u003e \u003cp\u003eWhile the findings of any single study should be approached with caution, our research indicates that deep learning algorithms can discern occupational outcomes\u0026mdash;specifically, distinguishing between entrepreneurs and non-entrepreneurs\u0026mdash;from publicly available human-centric datasets like Crunchbase with substantial and above-chance accuracy (79.51%). Conversely, human raters did not exceed chance levels in a comparable task. This adds to our knowledge of the capabilities of AI in (a) extracting a whole range of private personal information from readily accessible human-centric data [\u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2\u003c/span\u003e, \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e, \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e] and (b) outperforming humans, including experts, in such tasks. As highlighted by Kosinski [\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e], \u0026ldquo;one\u0026rsquo;s face is particularly difficult to hide in both interpersonal interactions and digital records,\u0026rdquo; making private information derived from facial images with substantial accuracy, including occupational details, a piece of sensitive information that can easily circulate within society, businesses, organizations, and among individuals. 
For example, entrepreneurial experience can have value as information because it has been linked to advantages in terms of access to entrepreneurial resources in (budding) entrepreneurs, including venture capital [\u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e36\u003c/span\u003e, \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e37\u003c/span\u003e], increased venture survival [\u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e38\u003c/span\u003e], and public policy schemes [\u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e39\u003c/span\u003e, \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e40\u003c/span\u003e].\u003c/p\u003e \u003cp\u003eWhile these are important research findings in themselves, it is also crucial to emphasize that the implications and application of such findings must be approached with utmost care. The potential for unintended consequences is high, and there are many pitfalls that one might not even be aware of. These pitfalls extend beyond social bias, such as stereotypes and discrimination, to include a range of societal risks. Misapplication of these findings can lead to significant ethical issues and negative impacts on society [\u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e6\u003c/span\u003e, \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e17\u003c/span\u003e, \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e18\u003c/span\u003e]. Therefore, our results should not be misinterpreted as broadly endorsing the widespread use of such AI methods, including facial recognition, for evaluating and classifying people. Testing these capabilities in our case does not imply reinforcement or recommendation in practice. The sole aim of this study was to assess AI\u0026rsquo;s capabilities compared to human performance within the limitations of the study design and potential social bias, which is a crucial consideration. 
For example, certain groups could be exposed to discrimination and stereotyping which could affect their likelihood of becoming an entrepreneur and hence representation in the dataset we used and the respective classification results. Social bias in the real world is well documented in a myriad of studies (e.g., in investment decisions affecting entrepreneurs [\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e] or the \u0026lsquo;what is beautiful is good\u0026rsquo; stereotype [\u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e41\u003c/span\u003e]; see also [\u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e25\u003c/span\u003e]). We also acknowledge the well-documented dangers of following any \u0026lsquo;illusions\u0026rsquo; of understanding AI-driven research results [\u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e42\u003c/span\u003e]. Hence, we highlight the findings as they are, focusing on the predictive capacity of AI in comparison to humans. However, it is essential to discuss them in the potential context of biases when considering the actual meanings and implications for society. We believe that our findings and this contextualized interpretation hinting at the potential of social bias add significant knowledge to the respective debates in society\u0026mdash;given the unquestionable disruptive potential of such AI methods and data in the real world, along with their major ethical implications affecting large parts of society.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec21\" class=\"Section2\"\u003e \u003ch2\u003e4.2 Limitations\u003c/h2\u003e \u003cp\u003eOur study has various limitations. First, although we ran several additional tests, we cannot say with absolute certainty what information the AI model picked up to distinguish entrepreneurs from non-entrepreneurs. 
For example, there is still a small chance, in our view, that facial expression or other situational factors could have played a role. However, as noted before, our main goal was not to identify the actual distinguishing features (e.g., what does an entrepreneurial face look like), but to test whether entrepreneurs are different at all and whether this can be reliably predicted in an AI vs. humans setting.\u003c/p\u003e \u003cp\u003eSecond, as with any empirical research project, the quality of the data that is used to train and test the AI model is critical. If the data is inaccurate or biased, the AI model will absorb these inconsistencies and generate flawed conclusions that may reproduce or amplify the biases present in the training data [\u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e43\u003c/span\u003e, \u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e44\u003c/span\u003e]. We use Crunchbase to retrieve and identify our sample of entrepreneurs and non-entrepreneurs. Crunchbase is one of the most widely used databases in contemporary entrepreneurship research due to its recent, accurate, and comprehensive coverage [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e, \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e, \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e20\u003c/span\u003e, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e21\u003c/span\u003e]. Despite its coverage, we acknowledge that the founding history recorded in Crunchbase might not be completely accurate for every individual. For example, some individuals that we classify as non-entrepreneurs might have participated in founding a new venture that is too insignificant to be recorded in Crunchbase. Others might deliberately omit founding information from their profile to mask past failures. 
This leads to a situation in which we might falsely classify some individuals as non-entrepreneurs even though they are entrepreneurs. Future research could try to circumvent this potential limitation by collecting training data from different sources (e.g., via surveys) or by performing thorough background checks on the individuals included in Crunchbase to verify the accuracy of their classification as entrepreneurs or non-entrepreneurs.\u003c/p\u003e \u003cp\u003eWe also acknowledge that the data included in Crunchbase that we use in our analyses might not be free of bias or representative of the entire population of entrepreneurs and non-entrepreneurs. Specifically, the individuals that we classify as non-entrepreneurs represent prominent business professionals (e.g., CEOs, managers, investors) so that our sample of non-entrepreneurs is not a cross-section of the general population. Instead, our assessment is closer to a comparison between entrepreneurs and managers, which is a popular comparison in entrepreneurship research [\u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e45\u003c/span\u003e, \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e46\u003c/span\u003e]. Moreover, potential biases in our data could stem from the fact that Crunchbase focuses mostly on tech ventures [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e] in the US [\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e19\u003c/span\u003e]. This suggests that our sample might be skewed towards entrepreneurs and non-entrepreneurs in the US tech sector. 
While we acknowledge that this sample might not be representative of the entire population of entrepreneurs, we want to emphasize that such ventures are a particularly important source of economic growth and innovation [\u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e14\u003c/span\u003e], so that our analysis is still impactful and relevant, even when considering this narrower scope. To summarize, while Crunchbase is a state-of-the-art database in contemporary entrepreneurship research and we are not aware of an alternative data source that would allow us to improve our model (i.e., from a technical, legal, and ethical standpoint), the caveats that we acknowledge need to be considered when interpreting the results.\u003c/p\u003e \u003cp\u003eThird, our model is trained with one facial image per person. Using more images per person could change the effectiveness of the training. It might also make it possible to study the role of age. While our sample covers individuals across all age groups, we do not know how the algorithm behaves for facial images of the same individual over time. That is, there might be some bias in our predictions related to age.\u003c/p\u003e \u003cp\u003eFinally, in our analysis we attempted to address potential bias regarding gender and racial background in the AI model, but we cannot completely rule out other biases. Examples include biases related to gender identity, disabilities, or ancestry. Because we are unable to reliably infer these characteristics from the data available in Crunchbase, we cannot consider them in our assessment.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec22\" class=\"Section2\"\u003e \u003ch2\u003e4.3 Conclusion\u003c/h2\u003e \u003cp\u003e Together with a growing body of related findings, studies like ours show the potential of AI and underscore the need for robust ethical guidelines and regulatory frameworks to govern the use of AI and human-centric data. 
This includes extracting personal information from publicly available data to prevent misuse, protect individual privacy, and ensure broader ethical standards when using such AI methods with certain types of data and the underlying potential of social bias reflected in this data and the respective AI results. This also highlights the necessity for extreme caution regarding ethical risks in study designs, results, and their interpretation and application. As technology advances, it becomes imperative to balance its potential benefits with the deep ethical challenges it presents, ensuring that AI deployment respects individual privacy and aligns with societal values and ethical standards.\u003c/p\u003e "},{"header":"Declarations","content":"\u003ch1\u003eAuthor contributions statement\u003c/h1\u003e\n\u003cp\u003eMartin Obschonka, Christian Fisch, Tharindu Fernando, and Clinton Fookes designed the research; Clinton Fookes developed the deep machine learning research idea and provided supervision; Martin Obschonka, Christian Fisch, and Tharindu Fernando performed the research and analyzed the data; Martin Obschonka, Christian Fisch, and Tharindu Fernando wrote the manuscript. All authors discussed the results. All authors reviewed and approved the manuscript.\u003c/p\u003e\n\u003ch1\u003eCompeting interests statement\u003c/h1\u003e\n\u003cp\u003eThe authors declare no competing interests.\u003c/p\u003e\n\u003ch1\u003eData availability statement\u003c/h1\u003e\n\u003cp\u003eThe data and code used in this study contain sensitive information and cannot be publicly shared. However, the code and data (except for the facial images) may be shared by the corresponding author upon reasonable request.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eRanjan, R. et al. Deep learning for understanding faces: machines may be just as good, or better, than humans. 
IEEE Signal Processing Magazine 35, 66\u0026ndash;83 (2018).\u003c/li\u003e\n\u003cli\u003eMoreno-Armend\u0026aacute;riz, M. A., Mart\u0026iacute;nez, C. A. D., Calvo, H. \u0026amp; Moreno-Sotelo, M. Estimation of personality traits from portrait pictures using the five-factor model. IEEE Access 8, 201649\u0026ndash;201665 (2020).\u003c/li\u003e\n\u003cli\u003eWang, Y. \u0026amp; Kosinski, M. Deep neural networks are more accurate than humans at detecting sexual orientation from facial images. Journal of Personality and Social Psychology 114(2), 246\u0026ndash;257 (2018).\u003c/li\u003e\n\u003cli\u003eKosinski, M. Facial recognition technology can expose political orientation from naturalistic facial images. Scientific Reports 11(1), 100 (2021).\u003c/li\u003e\n\u003cli\u003eKosinski, M., Khambatta, P. \u0026amp; Wang, Y. Facial recognition technology and human raters can predict political orientation from images of expressionless faces even when controlling for demographics and self-presentation. American Psychologist (2024). https://doi.org/10.1037/amp0001295\u003c/li\u003e\n\u003cli\u003eLevine, M., Philpot, R., Nightingale, S. J. \u0026amp; Kordoni, A. Visual digital data, ethical challenges, and psychological science. American Psychologist 79(1), 109\u0026ndash;122 (2024).\u003c/li\u003e\n\u003cli\u003eSantow, E. Emerging from AI utopia. Science 368, 9\u0026ndash;9 (2020).\u003c/li\u003e\n\u003cli\u003eBosma, N. The Global Entrepreneurship Monitor (GEM) and its impact on entrepreneurship research. Foundations and Trends in Entrepreneurship 9(2), 143\u0026ndash;248 (2013).\u003c/li\u003e\n\u003cli\u003eMcClelland, D. C. The achieving society (Van Nostrand Rinehold, 1961)\u003c/li\u003e\n\u003cli\u003eHaltiwanger, J. Entrepreneurship in the twenty-first century. Small Business Economics 58(1), 27\u0026ndash;40 (2022).\u003c/li\u003e\n\u003cli\u003eSchumpeter, J. A. 
The theory of economic development (Harvard University Press, 1934).\u003c/li\u003e\n\u003cli\u003eBaum, J. R. \u0026amp; Locke, E. A. The relationship of entrepreneurial traits, skill, and motivation to subsequent venture growth. Journal of Applied Psychology 89(4), 587\u0026ndash;598 (2004).\u003c/li\u003e\n\u003cli\u003eBrooks, A. W., Huang, L., Kearney, S. W. \u0026amp; Murray, F. E. Investors prefer entrepreneurial ventures pitched by attractive men. Proceedings of the National Academy of Sciences 111(12), 4427\u0026ndash;4431 (2014).\u003c/li\u003e\n\u003cli\u003eFreiberg, B. \u0026amp; Matz, S. C. Founder personality and entrepreneurial outcomes: a large-scale field study of technology startups. Proceedings of the National Academy of Sciences 120(19), e2215829120 (2023).\u003c/li\u003e\n\u003cli\u003eLazear, E. P. Balanced skills and entrepreneurship. American Economic Review 94(2), 208\u0026ndash;211 (2004).\u003c/li\u003e\n\u003cli\u003eLindquist, M. J., Sol, J. \u0026amp; Van Praag, M. Why do entrepreneurial parents have entrepreneurial children? Journal of Labor Economics 33(2), 269\u0026ndash;296 (2015).\u003c/li\u003e\n\u003cli\u003eLanders, R. N. \u0026amp; Behrend, T. S. Auditing the AI auditors: a framework for evaluating fairness and bias in high stakes AI predictive models. American Psychologist 78(1), 36\u0026ndash;49 (2023).\u003c/li\u003e\n\u003cli\u003eMadan, S., Savani, K. \u0026amp; Johar, G. V. How you look is who you are: the appearance reveals character lay theory increases support for facial profiling. Journal of Personality and Social Psychology 123(6), 1223\u0026ndash;1242 (2022).\u003c/li\u003e\n\u003cli\u003eTer Wal, A. L., Alexy, O., Block, J. \u0026amp; Sandner, P. G. The best of both worlds: the benefits of open-specialized and closed-diverse syndication networks for new ventures\u0026rsquo; success. Administrative Science Quarterly 61(3), 393\u0026ndash;432 (2016).\u003c/li\u003e\n\u003cli\u003eYu, S. 
How do accelerators impact the performance of high-technology ventures? Management Science 66(2), 530\u0026ndash;552 (2020).\u003c/li\u003e\n\u003cli\u003eFisch, C. \u0026amp; Block, J. H. How does entrepreneurial failure change an entrepreneur\u0026rsquo;s digital identity? Evidence from Twitter data. Journal of Business Venturing 36(1), 106015 (2021).\u003c/li\u003e\n\u003cli\u003eMomtaz, P. P. CEO emotions and firm valuation in initial coin offerings: an artificial emotional intelligence approach. Strategic Management Journal 42(3), 558\u0026ndash;578 (2021).\u003c/li\u003e\n\u003cli\u003eWarnick, B. J., Davis, B. C., Allison, T. H. \u0026amp; Anglin, A. H. Express yourself: facial expression of happiness, anger, fear, and sadness in funding pitches. Journal of Business Venturing 36(4), 106109 (2021).\u003c/li\u003e\n\u003cli\u003eColombo, M. G., Fisch, C., Momtaz, P. P. \u0026amp; Vismara, S. The CEO beauty premium: founder CEO attractiveness and firm valuation in initial coin offerings. Strategic Entrepreneurship Journal 16(3), 491\u0026ndash;521 (2022).\u003c/li\u003e\n\u003cli\u003eStefanidis, D., Nicolaou, N., Charitonos, S. P., Pallis, G. \u0026amp; Dikaiakos, M. What\u0026rsquo;s in a face? Facial appearance associated with emergence but not success in entrepreneurship. The Leadership Quarterly 33(2), 101597 (2022).\u003c/li\u003e\n\u003cli\u003eCao, Q., Shen, L., Xie, W., Parkhi, O. M. \u0026amp; Zisserman, A. VGGFace2: a dataset for recognizing faces across pose and age. 2018 13th IEEE International Conference on Automatic Face \u0026amp; Gesture Recognition (FG 2018), 67\u0026ndash;74 (2018).\u003c/li\u003e\n\u003cli\u003eRassadin, A., Gruzdev, A. \u0026amp; Savchenko, A. Group-level emotion recognition using transfer learning from face identification. Proceedings of the 19th ACM International Conference on Multimodal Interaction, 544\u0026ndash;548 (2017).\u003c/li\u003e\n\u003cli\u003eZhang, K., Zhang, Z., Li, Z. \u0026amp; Qiao, Y. 
Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Processing Letters, 23(10), 1499\u0026ndash;1503 (2016).\u003c/li\u003e\n\u003cli\u003eAcevez-Fernandez, M. A. Advances and applications in deep learning (IntechOpen, 2020).\u003c/li\u003e\n\u003cli\u003eBruns, S. B., Deressa, T. K., Stanley, T. D., Doucouliagos, C. \u0026amp; Ioannidis, J. P. Estimating the extent of selective reporting: an application to economics. Research Synthesis Methods 15(4), 590\u0026ndash;602 (2024).\u003c/li\u003e\n\u003cli\u003eLi, H., Meng, L. \u0026amp; Zhang, J. Why do entrepreneurs enter politics? Evidence from China. Economic Inquiry 44(3), 559\u0026ndash;578 (2006).\u003c/li\u003e\n\u003cli\u003eNystr\u0026ouml;m, K. Entrepreneurial politicians. Small Business Economics 41(1), 41\u0026ndash;54 (2013).\u003c/li\u003e\n\u003cli\u003eObschonka, M. \u0026amp; Fisch, C. Entrepreneurial personalities in political leadership. Small Business Economics 50, 851\u0026ndash;869 (2018).\u003c/li\u003e\n\u003cli\u003eAichholzer, J. \u0026amp; Willmann, J. Desired personality traits in politicians: similar to me but more of a leader. Journal of Research in Personality 88, 103990 (2020).\u003c/li\u003e\n\u003cli\u003eSchoen, H. \u0026amp; Schumann, S. Personality traits, partisan attitudes, and voting behavior. Evidence from Germany. Political Psychology 28(4), 471\u0026ndash;498 (2007).\u003c/li\u003e\n\u003cli\u003eHsu, D. H. Experienced entrepreneurial founders, organizational capital, and venture capital funding. Research Policy 36(5), 722\u0026ndash;741 (2007).\u003c/li\u003e\n\u003cli\u003eZhang, J. The advantage of experienced start-up founders in venture capital acquisition: evidence from serial entrepreneurs. Small Business Economics 36, 187\u0026ndash;208 (2011).\u003c/li\u003e\n\u003cli\u003ePaik, Y. Serial entrepreneurs and venture survival: evidence from US venture‐capital‐financed semiconductor firms. 
Strategic Entrepreneurship Journal 8(3), 254\u0026ndash;268 (2014).\u003c/li\u003e\n\u003cli\u003eBaum, J. A. \u0026amp; Silverman, B. S. Picking winners or building them? Alliance, intellectual, and human capital as selection criteria in venture financing and performance of biotechnology startups. Journal of Business Venturing 19(3), 411\u0026ndash;436 (2004).\u003c/li\u003e\n\u003cli\u003eShane, S. Why encouraging more people to become entrepreneurs is bad public policy. Small Business Economics 33, 141\u0026ndash;149 (2009).\u003c/li\u003e\n\u003cli\u003eDion, K., Berscheid, E. \u0026amp; Walster, E. What is beautiful is good. Journal of Personality and Social Psychology 24(3), 285\u0026ndash;290 (1972).\u003c/li\u003e\n\u003cli\u003eMesseri, L. \u0026amp; Crockett, M. J. Artificial intelligence and illusions of understanding in scientific research. Nature 627, 49\u0026ndash;58 (2024).\u003c/li\u003e\n\u003cli\u003eBarocas, S. \u0026amp; Selbst, A. D. Big data\u0026rsquo;s disparate impact. California Law Review 104(3), 671\u0026ndash;732 (2016).\u003c/li\u003e\n\u003cli\u003eManyika, J., Silberg, J. \u0026amp; Presten, B. What do we do about the biases in AI. Available at: https://hbr.org/2019/10/what-do-we-do-about-the-biases-in-ai (2019).\u003c/li\u003e\n\u003cli\u003eBrandst\u0026auml;tter, H. Personality aspects of entrepreneurship: a look at five meta-analyses. Personality and Individual Differences 51(3), 222\u0026ndash;230 (2011).\u003c/li\u003e\n\u003cli\u003eStewart, Jr., W. H. \u0026amp; Roth, P. L. Risk propensity differences between entrepreneurs and managers: a meta-analytic review. 
Journal of Applied Psychology 86(1), 145\u0026ndash;153 (2001).\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Artificial intelligence, AI, facial recognition technology, deep learning, entrepreneur, entrepreneurship, Convolutional Neural Network (CNN)","lastPublishedDoi":"10.21203/rs.3.rs-4926308/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4926308/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eOccupational outcomes like entrepreneurship are generally considered personal information that individuals should have the autonomy to disclose. With the advancing capability of artificial intelligence (AI) to infer private details from widely available human-centric data, such as social media, it is crucial to investigate whether AI can accurately extract private occupational information from such data. In this study, we demonstrate that deep neural networks can classify individuals as entrepreneurs based on a single facial image with high accuracy in data sourced from Crunchbase, a premier source for entrepreneurship data. 
Utilizing a dataset comprising facial images of 40,728 individuals, including both entrepreneurs and non-entrepreneurs, we trained a Convolutional Neural Network (CNN) and evaluated its classification performance. While human experts (n\u0026thinsp;=\u0026thinsp;650) and trained participants (n\u0026thinsp;=\u0026thinsp;133) were unable to classify entrepreneurs with accuracy above chance levels (\u0026gt;\u0026thinsp;50%), the AI model achieved a classification accuracy of 79.51%. Several robustness tests show that this high level of accuracy is maintained under various conditions.\u003c/p\u003e","manuscriptTitle":"AI and Entrepreneurship: Facial Recognition Technology Detects Entrepreneurs, Outperforming Human Experts","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-10-14 14:23:25","doi":"10.21203/rs.3.rs-4926308/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"c4a33ff0-9d64-4145-bb40-2e6afd9056d9","owner":[],"postedDate":"October 14th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-10-14T14:23:47+00:00","versionOfRecord":[],"versionCreatedAt":"2024-10-14 
14:23:25","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-4926308","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4926308","identity":"rs-4926308","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
