Utilizing Publicly Accessible Machine Learning for Training Student Formulators in Personal Care Product Formulation: Specification-Driven and Cost-Conscious Experiments

Research Square · Research Article

McKinnley Bilbao, Tomas Carmona, Morgan Covarrubias, Alex Goslin, and 7 more

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-3867796/v1 This work is licensed under a CC BY 4.0 License. Status: Posted. Version 1.

Abstract

This work explores the questionable use of machine learning (ML), specifically the ChatGPT 3.5 system, in the training of student formulators. Here, an experiment is undertaken to explore the ability of ML to aid in training students in the role of formulators of a personal care product.
The focus is on whether the students can successfully rely on ML to guide them through the formulation process of a 10-minute hydrating face mask recipe. While exploring the iterative process of recipe adjustments with ML, it was found that the language model can help formulators in training because of its above-average knowledge of chemistry, and that it performs much better at providing helpful suggestions for ingredient substitutions when given clear prompts. However, ML lacks a reliable memory, even within a single extended conversation, and struggles with mathematical calculations. ML is not found to be proficient in accurately calculating cost adjustments. Additionally, the contribution of ML may be only marginally helpful in the training of more seasoned formulators. Despite its limitations, ML can, in the hands of student formulators in training, quickly and effectively provide direction and support to produce and improve upon a base formula, resulting in a quality product.

Keywords: Machine Learning, ChatGPT, Cosmetics, Formulation, novice formulators, face mask

Introduction

The landscape of personal care product development is complex and multifaceted, with several interacting factors, such as ingredient choices, 2 manufacturing processes, and consumer demands. Formulation of these products demands a careful balance of ingredients to ensure the desired product efficacy, safety, and appeal to consumers. With rising costs and an increasing demand for innovative products, there is a need for methods that can streamline the development process and optimize both the product's specifications and the associated costs. An additional concern that is not necessarily on every formulator's radar, but is pertinent for this work, is the development of novice formulators.
Emerging at the forefront of these challenges is the potential application of machine learning (ML) technologies, which offer a data-driven approach to formulation 3 , 4 by analyzing available patterns, 5 predicting efficacies, 5 and generating recipes. Although incorporating ML into personal care product development must be done with discretion (or avoided if considered too risky), finding the strengths and weaknesses of ML in such a task is of great importance to ensure that formulators are aware of all available tools. 4 It would be considered irresponsible to use ML without an understanding of its limitations, but it would also be irresponsible to avoid the use of ML if it could bring value to consumers or other stakeholders. Understanding ML capabilities and limitations depends on understanding the source of the generated responses from an ML system (MLS). 4 The efficacy of any ML-driven formulation strategy would depend on the robust and relevant nature of the ML data, 4 – 7 and appropriate algorithms. 4 These are factors dependent on the MLS and are essentially unmodifiable by the users when using what is currently available and free. However, these are reasonable limitations to experiment with as a formulator whose initial use or exploration into the helpfulness of ML technologies will most likely begin with minimal investment. Of the user dependent factors, arguably the most impactful is the prompt provided to the MLS. 8 , 9 Prompt consistency to maintain the limited memory of the MLS is critical. An additional concern is the apparent fatigue when trying to undergo multiple iterative fine-tuning interactions with the MLS. The fast-changing nuances that drive the personal care industry, attention to safety regulations, consumer tastes, and changing trends, are all factors that complicate the integration of ML into the formulation process. It is questionable as to whether so many factors will be manageable by ML. 
Consistently accounting for these factors when generating prompts is critical to improving the usefulness of ML. This paper explores the results, and some of the methodology, of using ML to produce and improve a 10-minute hydrating face mask formula in the hands of student formulators. The goal is to see if student formulators can successfully produce a quality product following the guidance of ML. The work focuses on using ML to substitute ingredients in a base recipe in ways that enhance the qualitative or quantitative specifications of the product. This paper provides the reader insight into the preliminary use of ML for cosmetic formulations and the potential obstacles of this new technology in the hands of novice student formulators. The success of the student formulators is judged by whether qualitative and semi-quantitative product specifications are met, relative to the number of iterations required and the amount of improvement in specifications gained with further iterations. The results of this work are valuable to formulators, formulation chemists, and anybody involved in training students in laboratory settings. These results also lend insight into the value of courses where students are placed in exploratory laboratory settings. Such settings can be very empowering to students, allowing them to explore, unlock, and build talent in a safe space with a real-world product as a potential outcome. There is great potential in identifying whether ML can safely give more students the opportunity to create something new while becoming familiar with the components of a base recipe and their roles.

Methods

Machine Learning Preface

Because of the general unease amongst many communities about the use of ML in experimentation, it is important to clarify the role of ML in these experiments.
The use of ML in scientific studies is resulting in calls for identifying rigorous parameters in ML experiments to ensure repeatability and the scientific nature of the work. 4 For example, the independent variable in an experiment is generally chosen, and the dependent variable is what results from running the experiment on the independent variable. If an experiment that uses ML places an independent variable into a prompt and collects the ML response as the dependent variable, the reproducibility and scientific nature of the work would be questionable, as the variables for the MLS are neither identifiable nor repeatable. Thus, in experiments where ML provides the ultimate dependent variable, the nature of the work needs to be verified as repeatable and following the scientific method. However, in this work, ML responses to prompts are not taken as data points but as independent variables that are checked by product formation to see if ML provides useful direction toward improving the qualitative and semi-quantitative specifications. Therefore, the resulting dependent variable is the specifications of the created product and not what the ML reports. Thus, the fitness of the product is the data used in this paper to determine the usefulness of ML as a tool to improve the formulation process in the hands of student formulators. The formulation process can be defined for this work as the initial recipe being taken all the way to the final product. An improvement in the formulation process would be attaining a product faster, or attaining an iteration of a product with improved specifications, when using ML-produced directions. Although a significant portion of this work is qualitative or semi-quantitative, much of the decision-making about the quality of personal care products occupies such a space.
Observing the quality of the product in terms of the specifications as the formulation process proceeds is an indicator of the effectiveness of ML in training novice student formulators. This research concentrates on using ML to propose changes to an initial product formula and to enhance the qualitative characteristics of cosmetic products, with a minor exploration into cost reduction. The degree of focus on these factors is not based on relative interest in them but on the relative effectiveness of ML in contributing to their improvement. The MLS utilized was ChatGPT, which is currently freely available to the public. 10 Other MLS were considered; the most likely contenders were: 1) Agent GPT: https://agentgpt.reworkd.ai/ , 2) Literally Anything: https://www.literallyanything.io/ , and 3) Ora: https://ora.ai/dashboard . ChatGPT 3.5 was chosen because of its popularity, familiarity, and because it is currently always free. It is unlikely, however, that it would have performed better than the other three platforms and possibly many others.

Machine Learning for Formula Generation

The ML program utilized was ChatGPT 3.5, an ML language model based on the GPT-3.5 architecture with a professed prompt limit of approximately 600 words. 11 As a natural language processing model, ChatGPT excels at comprehending and generating human-like text in multiple languages. 12 The capabilities of ChatGPT version 3.5 include natural language understanding, text generation, content creation, answering questions, explaining concepts, and assisting in coding tasks. 10 Its knowledge base is limited to information available up until January 2022, and it does not process human emotion or experiences. 10 The responses are generated based on text patterns, and verification is recommended.
It is pointed out that the biggest shortcoming of ChatGPT application in chemical formulation and in chemistry in general is the lack of accurate mathematical calculations and the fabrications of false references and text content. 1 To compensate for these weaknesses and maintain repeatability when making decisions about formula changes based on ML feedback, Artrith et al. 4 recommend following a checklist when using ML tools in chemistry. They recommend that each researcher establishes a mechanism for clear results and reporting when using ML models like ChatGPT. 4 These mechanisms could be considered when documenting any formulation adjustments made by ML. This includes, most importantly, fact-checking suggestions by searching for the chemical names in publicly available libraries to ensure safety of the components and weighing the environmental and consumer experience impact before adopting any changes. Formula Synthesis This study began with a base formula: a standard recipe for a face mask that a local professional cosmetic chemist provided. The face mask is designed to be left on for 10-minutes and washed off with water. It is not an overnight face mask, but the formula ensures that leaving the mask on for a more extended period would not be an issue. Additionally, the product could be used as a hand or body moisturizing lotion instead of as a face mask. With this provided initial recipe, three sets of ML experiments were proposed. The first set of experiments involved using ML to find replacement components for ingredients in the provided recipe that were unavailable to the authors. This set of experiments dealt with finding prompts that effectively provided the MLS with an initial understanding of the ingredients, their relative concentrations, and the proposed role of the specific component. Identifying the role of the components was a subset of the first experiment. With a specified role, the MLS could locate a replacement component. 
As with all three sets of experiments, the first set of experiments was followed up by making various formulas for the product with the substituted ingredients and analyzing the resulting properties, specification, stability, and microbial resistance. The second set of experiments revolved around identifying formula replacements based on desired specifications. This process was heavily impacted by the prompts given to the MLS. Providing enough information in prompts to keep the MLS up to date through any back-and-forth conversations was critical. Again, this set of experiments was followed up by making various formulas for the product with the suggested ingredients and analyzing the resulting properties, specification, stability, and microbial resistance. The third set of experiments was focused on lowering the cost of the product while maintaining the quality with the desired specifications. The prompts in this set of experiments attempted to blend price findings through research outside the MLS with the feedback provided by the MLS. Throughout this study, approximately 20 different ML recipe iterations were generated, formulated, and challenged. At least 40 iterations were generated but were not created because of apparent discrepancies in the ML-provided suggestions. It became clear that ML needs to have a semi-trained eye overseeing its responses at all times. During this process, different formulators were trained in the production of the base formula and how to query ML to make changes to the formula based on specification enhancements or cost. Specifications and cost per item and batch were recorded. Some in-house code was written to correct any miscalculated ML suggestions. When ML professes to know something that is incorrect or provides a calculation or data that is incorrect, it is commonly known as a hallucination. The code was written to correct mathematical hallucinations associated with the formula suggestions (see additional materials for the code). 
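The in-house code itself is in the additional materials and is not reproduced here. As a rough illustration of the kind of check described, the following minimal Python sketch converts a suggested recipe given in grams into mass percentages and rescales it to a target batch mass. The function names (`to_percentages`, `scale_to_batch`), the tolerance, and the example ingredients and amounts are all hypothetical and are not taken from the authors' code.

```python
def to_percentages(masses_g):
    """Convert ingredient masses (g) into mass percentages of the total."""
    total = sum(masses_g.values())
    if total <= 0:
        raise ValueError("total mass must be positive")
    return {name: 100.0 * mass / total for name, mass in masses_g.items()}


def scale_to_batch(percentages, batch_mass_g, tolerance=0.01):
    """Convert mass percentages into grams for a target batch mass.

    Raises ValueError when the percentages do not sum to 100 within
    `tolerance`, which flags an arithmetic slip in a suggested recipe.
    """
    total_pct = sum(percentages.values())
    if abs(total_pct - 100.0) > tolerance:
        raise ValueError(f"percentages sum to {total_pct:.2f}, not 100")
    return {name: batch_mass_g * pct / 100.0 for name, pct in percentages.items()}


# Hypothetical ML suggestion given in grams (illustrative values only).
suggested_g = {"water": 80.0, "glycerin": 12.0, "thickener": 2.0, "oil": 6.0}
percentages = to_percentages(suggested_g)
batch_g = scale_to_batch(percentages, batch_mass_g=50.0)
```

Raising an error when the percentages do not sum to 100, instead of silently renormalizing, is a design choice made here so that a miscalculated suggestion is noticed before anything is weighed out.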
Formula Testing To ensure the formulas created were stable and microbe resistant, challenges were performed regarding the stability and microbe propagation of the formula. These were used to get a general direction needed in terms of ingredients used in an ML prompt. The stability test was a one-week test at 50°C with no added humidity. The product was well covered, and dehydration resulted in condensation of water on the top of the container and an expansion of the container lid, indicating a slight increase in the pressure on the sample and a decrease in the product’s water content as the challenge proceeded. The micro challenge was performed by inoculating 5 mL of sterile Luria–Bertani medium (LB), with approximately 50 µL of the product to be challenged. The resulting mixture was swirled at 37°C for 24 hours, and the sample’s clarity and odor were observed. Bacterial growth was also detected in control samples to verify the method was reliable. If there was any question about whether microbial growth was occurring, the clarity of the solution was quantified using a visible spectrometer to monitor the solution’s absorbance at 600nm. Castro et al. 13 demonstrated that ChatGPT does not always provide curated data and that it struggles with understanding the context of chemistry prompts. During experimentation, the ML-created recipe iterations were also provided in a format that did not translate directly to lab work. To test these recipe iterations, a code was created that translated the ML recipe formula into percentages and grams to ensure the required mass of the suggested substitution and to verify the desired total mass of material was created (see additional materials for code). Testing Different Iterations Ten novice, student formulators worked in groups of two or three people, with different groups creating the ML iterations of the recipe, sometimes in duplicate. 
Iterations were created by one of the various groups; the resulting product was tested for specifications by a different group. The specification testing was consistently done by the same individual to help maintain specification integrity. The qualitative properties of the formula that were tested were odor, color, pH, viscosity, skin hydration, after-wash effect, and any other noticeable changes between recipes. Quality control testing for microbial growth and stability was then conducted. Upon completion of testing, each iteration underwent one of two steps: 1) if the formula had undergone undesirable changes, it would be returned for additional ML adjustments or 2) if the formula had undergone favorable changes, it would be put before the remaining formulators (n = 10) and a small set of consumers (n = 4) for additional feedback. The additional feedback helped direct future decisions about changes to the recipe, which was then also returned for further ML iterations. Results Figure 1 is a flow chart for some of the iterations. It shows the initially provided recipe and the adjusted base recipe based on the initial ML suggestions from the available library of materials. The 10 additional iterations included in Fig. 1 show some of the changes made per the ML suggestions based on a desire to improve specifications. A total of approximately 25 different ML driven iterations were performed. Table 1 shows the qualitative and quantitative specifications that were collected on 17 of the resulting products. When specifications were not measured, it was usually because of an egregious result that made further evaluation unnecessary. All 17 samples in Table 1 passed the relatively straight forward micro challenge, but not all 17 samples passed the stability challenge. Figure 2 provides the reader an opportunity to visualize six samples to get a feel for the resulting face mask product before (top two rows) and after (bottom two rows) stability testing. 
The second sample from the left in Fig. 2 (labeled sample 5 in Table 1) is an example of a product that showed strong separation after the stability challenge. The viscosities reported in Table 1 were taken with an NDJ-5S digital viscometer immediately after the probe was placed into the sample. The viscosity of each sample was taken again after the probe had been in the sample for two minutes. The percentage difference between the initial measurement and the measurement at two minutes is reported in parentheses in the viscosity column of Table 1 as well. This percentage provides quantitative information as to the Newtonian nature of the product. A large percentage change is an indication of a more non-Newtonian material. The differences in the Newtonian nature of the samples account for the quantitative viscosities not matching perfectly with the qualitative ratings of the viscosities.

Table 1. Results of formula iterations. Specifications were determined by the same formulators to improve qualitative consistency. The ideal rating is given in parentheses in each column heading. Specifications that were not determined because of egregious product failure are marked ND.
| Sample Number | Brief Description | Odor (++) | Viscosity (+++) | Appearance (+) | Skin hydration (++) | After effect (+) | Viscosity in Pa s⁻¹ (% change after 2 min) | % change price per gram (Base cost $0.034/g) |
|---|---|---|---|---|---|---|---|---|
| 1 | Base | + | -- | + | ND | ND | ND | 0% |
| 2 | XG as emulsifier/viscosity modifier, 10% total weight | +++ | +++++ | ++ | ++ | + | ND | -20% |
| 3 | XG changed to 2% total weight | +++ | ++++ | ++ | ++ | + | 30.0 (3.2) | -50% |
| 4 | GG as emulsifier/viscosity modifier, 0.1% total weight | + | - | + | + | - | ND | -51% |
| 5 | GG changed to 1% total weight | + | ++ | + | + | - | ND | -51% |
| 6 | GG changed to 1.2% total weight | +++ | ++++ | + | + | ++ | ND | -50% |
| 7 | Siligel as emulsifier/viscosity modifier, 2% total weight | + | ++ | + | ++ | + | 67.4 (76) | -12% |
| 8 | XG & GG 50/50 mixture, 1.2% total weight | + | ++++ | + | + | + | 35.9 (19) | -49% |
| 9 | Peppermint fragrance, Shea Butter as Skin Conditioning Agent | +++ | ++ | + | +++ | - | 22.6 (17.9) | -54% |
| 10 | Siligel as emulsifier/viscosity modifier, 3% total weight | +++ | +++ | + | - | ++ | ND | 9% |
| 11 | Witch Hazel as humectant | + | ++++ | + | - | - | 39.5 (4.5) | 27% |
| 12 | Changed SLS to Polyglucose as emulsifying agent | +++ | +++ | + | - | - | 62.7 (52) | 7% |
| 13 | Peg A Dimethicone w/ Polyglucose | - | ++ | + | + | - | 57.0 (15) | -1% |
| 14 | Coco Betaine w/ Polyglucose | ND | ND | ND | ND | ND | 66.0 (35) | 41% |
| 15 | Replace Siligel w/ Carbomer | + | +++++ | - | ND | ND | 30.1 (37) | -57% |
| 16 | Optiphen as preservative | + | +++ | + | ND | ND | 23.0 (36) | -1% |
| 17 | Hyaluronic Acid as active | ++ | +++ | + | ++ | + | 40.5 (7.2) | 10% |
| 18 | Aloe added as a humectant | ND | -- | ND | ND | ND | ND | 41% |

Seeking Substitutions

The initial recipe provided as a base recipe contained ingredients that were not immediately available to the formulators. Thus, the first experimental question was whether ML could produce an adjusted base recipe from the library of ingredients available to the formulators. The originally provided base recipe, along with the current library of ingredients, was provided in a prompt to ML, and replacements for the missing components were sought. ML provided a new adjusted base recipe, and the formula was created.
The product was lacking in many ways, but it provided a great boost in productivity by creating a quick place to start the development process. ML was extremely good at this step, and this felt like the most empowering result. ML was able to pull together a product that had qualitatively measurable specifications from a relatively limited, although diverse, library of materials. At this point, the formulation process could turn to specific improvements. From the conception of the product to the step of having a product that was ready for revision, it felt like there was a considerable benefit to using ML. Once further adjustments were needed to address identified specifications with materials in the current library or with novel materials, the process slowed down. Gains on the product were not always attained when ML was asked to provide a product that had improved specifications. Usually, the new product had a change in the desired specification, which is a positive, but it was not always an improvement. Additionally, the suggestions became less impressive and less exploratory in nature. The suggestions were more predictable to more seasoned formulators. A comment should also be made regarding the direction sought from ML as to the amount that should be added to achieve the desired specification. The amount of a substance to be added was rarely provided by ML directly, even if the rest of the recipe was provided. Instead, ML would respond that the manufacturer's recommended amounts should be followed. Further insight could be gained by asking the same question from a different approach, but it was quickly clear that either not enough information was available, or ML did not have the ability to parse all of the necessary variables in the particular situation to provide a reliable amount of a replacement ingredient.
Therefore, to complete such a process, multiple rounds were necessary to ascertain whether the provided substitution was truly unacceptable, or whether its concentration was simply not correct to get the desired change. This began to feel like the ML was slowing things down. Formulators wanted to apply their own new ideas before exploring the ML suggestions because the new products that were made were easily characterizable, and smaller gains seemed to be made in this back-and-forth situation. The third set of experiments then explored improving the cost of the product by asking ML to provide replacement options for higher-cost ingredients. ML seemed confident in its suggestions; however, it again quickly felt like ML was out of its element when asked to gather information on pricing. This is not necessarily the fault of ML or a hallucination of ML, but it seemed to be another scenario in which there were too many variables to account for, changing too frequently, when trying to identify the price of an ingredient. General information about prices of materials was generally reliable, but specifics were not robust enough to provide reliable information compared to internet searches. Thus, ML suggestions needed to be double-checked with internet and vendor catalog searches to verify pricing. The contribution of ML was not easily evident in these price-driven experiments. This third set of experiments ended up having the same feeling as the second set, in which the mass percent of the suggested substitution was always in question. Thus, any lack of specification improvement, or of specification maintenance in the case of an attempted cost reduction, required many iterations to verify. Hence, this did not raise the ability of ML above that of a simple internet search.

Machine Learning

ML is always quick to offer many suggestions to adjust cosmetic recipes when provided specific prompts about the type of product being formulated.
However, it is important to note that the ability of ML to retain information from previous interactions, even within the same chat session, is limited. This means that while it excels at ideas for quick adjustments based on current input, and often will behave as if it has retained information from recent conversations, it is limited in its ability to leverage information from earlier discussions or remember past formulations to make more complex decisions based on previous results. Thus, it relies heavily on the information provided in the immediate prompt. This makes it less able to contribute intelligently to the usual gradual adjustments in the formulation process. Additionally, there seems to be an issue with the attention ML gives to the prompts. There is a stated limit to prompt size; however, there also seems to be an internal limitation in how ML gathers or weights the beginning portion of a longer prompt against the latter portion. It seems that ChatGPT, for example, will not always identify that a given prompt exceeds the specified limit, and it will end up addressing only the latter part of the prompt. It is as if the limit on the prompt size is counted from the end of the prompt, in words or symbols, back toward the beginning. This is just an observation from this work and is not referenced in any other work. Thus, the value of this specific finding should be considered accordingly. While ML can excel in its ability to provide quick suggestions for adjusting cosmetic recipes based on given parameters, it is relatively unreliable in terms of mathematical accuracy. 1 ML particularly falls short in making decisions about necessary amounts for synthesizing cosmetic products. While it can provide estimations and suggestions, its calculations are frequently incorrect.
The suggestions are not necessarily incorrect in terms of meeting desired specifications, but ML is not reliable when asked to determine percent composition or the appropriate mass to add to reach a desired mass percent. Therefore, it is essential for formulators to double-check and validate these calculations to ensure the accuracy and safety of the final product. ML does have access to chemistry knowledge 7 and can piece much of the data together to provide simple answers, and it provides enough feedback to easily trick a novice into thinking it can produce more complex answers to questions with more variables. However, as mentioned above, the answers provided to complex problems are usually incorrect. Unfortunately, the inaccuracies in the answers sometimes require expert knowledge to identify. While it is challenging to quantify the depth of knowledge and abilities of ML, providing a little context in the prompt, such as explaining the project at hand, enables ML to make more connections with its database and thus offer more suggestions on how to proceed. These suggestions can include commonly used chemicals in personal care products. For instance, when presented with the task of enhancing a specific aspect of a base recipe for a face mask, the MLS was able to generate a suggestion that would be helpful to a novice formulator. A large, detailed prompt is required to get the ML to provide the illusion that it understands the issue at hand. The template that was generally used is provided below.

PROMPT: “The following text is the recipe for a gel-based cosmetic face mask:
Purified water: 70.65%
EDTA: 0.15%
Glycerin: 6.00%
Butylene Glycol: 3.00%
Xanthan gum: 0.6%
Guar Gum: 0.6%
AntiMicrobial Banana mixture: 0.90%
Sodium Lauryl Sulfate: 1.00%
Mango Butter: 3.00%
Olive Oil: 5.00%
Coconut Oil: 6.00%
Vitamin E: 0.50%
Polyglyceryl Oleate: 2.00%
Papaya Banana: 0.10%
Protein-Hyaluronate blend: 0.50%
The desired yield of this recipe is 50 grams.
From this face mask recipe, suggest a substitute for the Xanthan gum in the correct scaled mass amount that will give a creamy consistency.” ML RESPONSE: “Xanthan gum is often used in cosmetic and skincare formulations to provide a thickening and stabilizing effect. If you're looking for a substitute for xanthan gum that will give your face mask a creamy consistency, you can consider using Guar Gum as a replacement. Guar gum is a natural thickener and stabilizer commonly used in cosmetic products and can be substituted for xanthan gum. To determine the correct scaled mass amount of guar gum as a substitute for the 5 grams of xanthan gum, you can use a similar proportion. Here's the calculation: Original Xanthan Gum: 5 g Guar Gum Substitute: (5 g x Mass of Guar Gum) / Mass of Xanthan Gum To calculate the mass of Guar Gum, you can rearrange the equation: Mass of Guar Gum = (5 g x Mass of Guar Gum) / 5 g Mass of Guar Gum = Mass of Guar Gum So, you can substitute 5 grams of Guar Gum for the 5 grams of xanthan gum in your recipe to achieve a creamy consistency.” Although rudimentary, ML successfully answered the question and suggested a reasonable substitute for a thickener. It is also important to note that the formulators relied on a template prompt to achieve these results. However, in this specific case, the product's quality decreased when the two formulas were compared. This decrease in texture quality was because the product was too thick due to the materials not being able to be replaced on a 1:1 gram ratio. This is not difficult to fix but the ML aspect was not more helpful in this case than a simple internet search. As mentioned, ML is not proficient in the mathematical calculations expected for formulation creation. ML was unable to make a chemical substitution that maintained the desired mass yield and percentage of the chemical component. For that reason, the formulators generated a code that could readjust the ML suggested mass to the base recipe. 
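The authors' readjustment code is not shown in the text, so the following is only a minimal Python sketch of one way such a correction could work. It uses the percentages from the prompt above; the function names, the optional `new_pct` argument, and the choice of purified water as the balancing ingredient are assumptions made for illustration. For a 50 g batch, xanthan gum at 0.6% corresponds to 0.3 g, not the 5 g stated in the ML response.

```python
# Base recipe from the prompt above, in mass percent.
BASE_RECIPE = {
    "Purified water": 70.65, "EDTA": 0.15, "Glycerin": 6.00,
    "Butylene Glycol": 3.00, "Xanthan gum": 0.6, "Guar Gum": 0.6,
    "AntiMicrobial Banana mixture": 0.90, "Sodium Lauryl Sulfate": 1.00,
    "Mango Butter": 3.00, "Olive Oil": 5.00, "Coconut Oil": 6.00,
    "Vitamin E": 0.50, "Polyglyceryl Oleate": 2.00, "Papaya Banana": 0.10,
    "Protein-Hyaluronate blend": 0.50,
}


def substitute(recipe, old, new, new_pct=None, balance="Purified water"):
    """Return a copy of `recipe` with `old` replaced by `new`.

    By default `new` takes over the mass percent of `old`. If `new_pct` is
    given, the difference is absorbed by the `balance` ingredient so the
    total stays at 100% (using water as the balance is an assumption here).
    `old` and `balance` must be different ingredients present in `recipe`.
    """
    updated = dict(recipe)
    old_pct = updated.pop(old)
    if new_pct is None:
        new_pct = old_pct
    # Merge with any amount of `new` that is already in the recipe.
    updated[new] = updated.get(new, 0.0) + new_pct
    updated[balance] = updated[balance] + (old_pct - new_pct)
    return updated


def grams(recipe, batch_mass_g):
    """Mass of each ingredient (g, rounded to 0.001 g) for the batch."""
    return {name: round(batch_mass_g * pct / 100.0, 3)
            for name, pct in recipe.items()}


xanthan_g = grams(BASE_RECIPE, 50.0)["Xanthan gum"]   # 0.3 g in a 50 g batch
swapped = substitute(BASE_RECIPE, "Xanthan gum", "Guar Gum")
swapped_g = grams(swapped, 50.0)                      # guar gum: 1.2%, 0.6 g
```

Keeping the substitute at the same mass percent is only a starting point. Because a 1:1 replacement made the product too thick, a call such as `substitute(BASE_RECIPE, "Xanthan gum", "Guar Gum", new_pct=0.3)` could be used to try a lower level while the total remains 100%.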
This readjustment was a necessary intermediate step to maintain a consistent product mass while adhering to the percentage of each component in the base recipe. It is an example of the intervention needed to ensure that multivariate ML suggestions were accurate, or at least adjusted so that they could be applied accurately.

Regarding ML capabilities in chemical formulation, ML was very successful in quickly allowing formulators in training to adjust a base cosmetic recipe to reach a stable starting point from a provided library of materials. A sellable product emerged in just two iterations. In the hands of formulators in training, ML was partially helpful in enhancing product characteristics, but the requirement that users have formulation training, albeit minimal, made the ML contribution more cumbersome. Thus, the efficiency of the second round of experiments decreased quickly from iteration to iteration. This lack of improvement seemed to be more than a typical case of diminishing marginal returns; it appeared to stem from an obstacle within the ML recommendations that could not be specifically identified (e.g., the wrong amount or the wrong role of the ingredient for the specification in the prompt).

Regarding the use of ML for cost calculations per unit or per batch, formulators did not trust that the ML suggestions were moving the product in a cost-decreasing direction while maintaining the quality of previous iterations. In other words, because the second set of experiments was only marginally helpful after two or three iterations, the additional task of maintaining specifications while lowering a quantitative value (the cost) seemed to be too much for ML in the hands of new formulators. The calculated cost did fluctuate between iterations, but the fluctuation was close in magnitude to the price variation among vendors for the materials used, thus nullifying any real gains from the ML proposals.
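The cost comparison described above, a change in calculated batch cost weighed against vendor price variation, can be sketched as follows. This is an illustration only: the function names are hypothetical, and the ingredient names, prices, and the 5% vendor spread in the usage section are placeholder values chosen for the example, not data from this study.

```python
"""Sketch: is a cost change between two iterations larger than vendor noise?"""


def batch_cost(recipe_pct, price_per_g, batch_g):
    """Cost of one batch given mass percents and a price per gram for each item."""
    missing = [name for name in recipe_pct if name not in price_per_g]
    if missing:
        raise KeyError(f"no price for: {', '.join(missing)}")
    return sum(
        batch_g * pct / 100.0 * price_per_g[name]
        for name, pct in recipe_pct.items()
    )


def saving_is_meaningful(cost_before, cost_after, vendor_spread_fraction):
    """True only if the relative saving exceeds the vendor price variation."""
    relative_saving = (cost_before - cost_after) / cost_before
    return relative_saving > vendor_spread_fraction


if __name__ == "__main__":
    # Placeholder two-component recipes and prices (hypothetical values).
    before = {"Ingredient A": 90.0, "Ingredient B": 10.0}
    after = {"Ingredient A": 91.0, "Ingredient B": 9.0}
    prices = {"Ingredient A": 0.01, "Ingredient B": 0.10}  # currency per gram

    cost_before = batch_cost(before, prices, 50.0)
    cost_after = batch_cost(after, prices, 50.0)
    print(f"before: {cost_before:.3f}  after: {cost_after:.3f}")
    print("meaningful saving:", saving_is_meaningful(cost_before, cost_after, 0.05))
```

Computing the cost outside the MLS and comparing it with an explicit threshold keeps the decision about whether a proposal actually lowers cost in the formulator's hands rather than relying on ML arithmetic.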
Conclusion

ML can serve as a valuable aid in helping new formulators learn how to adjust cosmetic recipes to achieve a high-quality product very quickly. The effectiveness of ML in contributing to deeper learning by new formulators is still contingent upon the supervision of a trained formulator. While minimally trained formulators can benefit from ML guidance, expert formulators may find that ML suggestions offer few novel insights because of the limitations of the data sources available to the MLS. The level of training and expertise of the formulator significantly influences the extent to which ML contributes effectively to the formulation process. There is a balance: undertrained individuals can receive valuable assistance, yet seasoned formulators might not find groundbreaking insights, as the data available to the MLS may not be as comprehensive or up to date as their own experience and knowledge.

For future studies, it is suggested that the experiments be repeated with a larger number (n) of consumer testers and that consumer feedback be used more directly in the iteration loop. However, it may be too early to do this, as consumer feedback may not be as focused as a professional formulator’s experience. Another suggestion is to quantify the qualitative specifications with a rubric, which will come with its own set of problems. By assigning each quality a quantitative measurement on a fixed scale, future studies could compute an average overall “score” for each formulation. However, locking in on such a score could lead the formula to a local optimum that is not actually ideal. Both of these suggestions are independent of ML. A truly high-risk proposal would be to see whether ML could help quantify the feedback provided by consumers. In such a case, ML would truly be producing the “usable” data.
Although possibly highly rewarding, this could also have disastrous effects given the track record of ML with quantifiable parameters such as mathematical calculations or cost reduction. As previously mentioned, such an endeavor would require strict adherence to mechanisms that ensure repeatability and adherence to the scientific method (ref. 4). Future researchers should note the possibility of supporting ML with additional code, written for a specific recipe or laboratory setting, that converts ML responses into usable data. Building and using such auxiliary programs has the potential to make working with ML more efficient and possibly expandable into different areas of formulation.

Based on the results of this research, it can be concluded that ML shows potential for adjusting cosmetic formulations and suggesting chemical substitutions in the hands of novice formulators. ML demonstrates some background knowledge in chemistry, particularly when given context in the chat prompt. However, it lacks a reliable memory within the same chat session and struggles with the mathematical calculations needed for precise adjustments. For new formulators, the iterative process of recipe adjustment using ML yielded mixed results, with varying levels of success in achieving the desired product specifications. Additionally, ML was not considered proficient or significantly helpful in making effective cost adjustments.

Declarations

Statements and Declarations
This work was supported by internal funding from Utah Valley University, College of Science. The authors declare that no external funds, grants, or other support were received during the preparation of this manuscript.

Competing Interests
The authors have no relevant financial or non-financial interests to disclose.

Author Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by all of the authors.
The first draft of the manuscript was written by Hazel Ticas, followed by significant changes and contributions by all authors. The final draft was written by Daniel Scott. Subsequently, all authors commented on the final versions of the manuscript. All authors read and approved the final manuscript.

References

1. Tyson, J. (2023). Shortcomings of ChatGPT. Journal of Chemical Education, 100(8), 3098–3101. https://doi.org/10.1021/acs.jchemed.3c00361
2. Lee, C. K. H., Choy, K. L., & Chan, Y. N. (2014). A knowledge-based ingredient formulation system for chemical product development in the personal care industry. Computers & Chemical Engineering, 65, 40–53. https://doi.org/10.1016/j.compchemeng.2014.03.004
3. Bannigan, P., Aldeghi, M., Bao, Z., Häse, F., Aspuru-Guzik, A., & Allen, C. (2021). Machine learning directed drug formulation development. Advanced Drug Delivery Reviews, 175, 113806. https://doi.org/10.1016/j.addr.2021.05.016
4. Artrith, N., Butler, K. T., Coudert, F. X., et al. (2021). Best practices in machine learning for chemistry. Nature Chemistry, 13(6), 505–508. https://doi.org/10.1038/s41557-021-00716-z
5. Murray, J. D., Lange, J. J., Bennett-Lenane, H., et al. (2023). Advancing algorithmic drug product development: Recommendations for machine learning approaches in drug formulation. European Journal of Pharmaceutical Sciences, 191, 106562. https://doi.org/10.1016/j.ejps.2023.106562
6. Xiao, P., Zhang, X., Pan, W., et al. (2020). The Development of a Skin Image Analysis Tool by Using Machine Learning Algorithms. Cosmetics, 7(3), 67. https://doi.org/10.3390/cosmetics7030067
7. Shi, Y. F., Yang, Z. X., & Ma, S. (2023). Machine Learning for Chemistry: Basics and Applications. Engineering. Published online July 31. https://doi.org/10.1016/j.eng.2023.04.013
8. Korzyński, P., Mazurek, G., Krzypkowska, P., & Kurasiński, A. (2023). Artificial intelligence prompt engineering as a new digital competence: Analysis of generative AI technologies such as ChatGPT. Entrepreneurial Business and Economics Review, 11(3), 25–37. https://doi.org/10.15678/eber.2023.110302
9. Aljanabi, M., Yaseen, M. G., Ali, A. H., & Mohammed, M. A. (2023). Prompt Engineering: Guiding the Way to Effective Large Language Models. Iraqi Journal for Computer Science and Mathematics, 4(4), 151–155. https://doi.org/10.52866/ijcsm.2023.04.04.012
10. OpenAI. (2022). Introducing ChatGPT. OpenAI Blog. https://openai.com/blog/chatgpt
11. O’Sullivan, R. (2023). What are ChatGPT’s limits? Namecheap. https://www.namecheap.com/blog/what-are-chatgpts-limits/
12. Flensted, T. (2023). How Many Languages Does ChatGPT Support? The Complete ChatGPT Language List. SEO.AI. https://seo.ai/bprlog/how-many-languages-does-chatgpt-support
13. Castro Nascimento, C. M., & Silva Pimentel, A. (2023). Do Large Language Models Understand Chemistry? A Conversation with ChatGPT. Journal of Chemical Information and Modeling, 63(6), 1649–1655. https://doi.org/10.1021/acs.jcim.3c00285

Additional Declarations
No competing interests reported.

Supplementary Files
SupplementalMaterialsforUtilizingPubliclyAccessibleMachineLearningforPersonalCareProductFormulations161136am.docx
Figure 1. Flow chart of a number of ingredient modifications across several iterations. Ingredient modifications were based on ML suggestions that were specification driven.

Figure 2. Six example iterations before (top two rows) and after (bottom two rows) a stability challenge at 50 °C for 1 week. Sample numbers from left to right (numbers from Table 1) are 3, 5, 9, 10, 11, and 12. Sample 5 had a strong separation after the stability challenge.
With rising costs and an increasing demand for innovative products, there is a need for methods that can streamline the development process and optimize both the product\u0026rsquo;s specifications and the associated costs. An additional concern that is not necessary on every formulator\u0026rsquo;s radar, but is pertinent for this work, is the development of novice formulators. Emerging at the forefront of these challenges is the potential application of machine learning (ML) technologies, which offer a data-driven approach to formulation\u003csup\u003e\u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e3\u003c/span\u003e,\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e by analyzing available patterns,\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e predicting efficacies,\u003csup\u003e\u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e5\u003c/span\u003e\u003c/sup\u003e and generating recipes.\u003c/p\u003e \u003cp\u003eAlthough incorporating ML into personal care product development must be done with discretion (or avoided if considered too risky), finding the strengths and weaknesses of ML in such a task is of great importance to ensure that formulators are aware of all available tools.\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e It would be considered irresponsible to use ML without an understanding of its limitations, but it would also be irresponsible to avoid the use of ML if it could bring value to consumers or other stakeholders. Understanding ML capabilities and limitations depends on understanding the source of the generated responses from an ML system (MLS). 
\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eThe efficacy of any ML-driven formulation strategy would depend on the robust and relevant nature of the ML data,\u003csup\u003e\u003cspan additionalcitationids=\"CR5 CR6\" citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u0026ndash;\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e and appropriate algorithms.\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e These are factors dependent on the MLS and are essentially unmodifiable by the users when using what is currently available and free. However, these are reasonable limitations to experiment with as a formulator whose initial use or exploration into the helpfulness of ML technologies will most likely begin with minimal investment.\u003c/p\u003e \u003cp\u003eOf the user dependent factors, arguably the most impactful is the prompt provided to the MLS.\u003csup\u003e\u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e8\u003c/span\u003e,\u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e9\u003c/span\u003e\u003c/sup\u003e Prompt consistency to maintain the limited memory of the MLS is critical. An additional concern is the apparent fatigue when trying to undergo multiple iterative fine-tuning interactions with the MLS. The fast-changing nuances that drive the personal care industry, attention to safety regulations, consumer tastes, and changing trends, are all factors that complicate the integration of ML into the formulation process. It is questionable as to whether so many factors will be manageable by ML. 
Constantly ensuring and noting these factors when generating prompts is critical to improve the usefulness of ML.\u003c/p\u003e \u003cp\u003eIn this paper, the aim is to explore the results and a bit of the methodologies behind the use of ML in the production and improvement of a 10-minute hydrating facemask formula in the hands of student formulators. The goal is to see if student formulators can successfully produce a quality product following the guidance of ML. The work here focuses on the utilization of ML for the substitution of ingredients from a base recipe that enhances qualitative or quantitative specification of a product. This paper provides the reader insight into the preliminary use of ML for cosmetic formulations and the potential obstacles of this new technology in the hands of novice student formulators. The resulting success of the student formulators will be based upon the meeting of qualitative and semi quantitative product specifications relative to the number of required iterations and the amount of improvement in specifications with further iterations.\u003c/p\u003e \u003cp\u003eThe results of this work are valuable to the perspective of formulators, formulation chemists and anybody involved in the training student in laboratory settings. These results would also lend insight into the value of courses where students are set in exploratory laboratory settings. Such settings can be very impowering to students allowing them to explore and build unlocked talent in a safe space with a real-world product as a potential outcome. 
There is great potential in identifying if ML can safely allow more students the opportunity to be involved in the creation of something new in the process of becoming familiar with the roles and different components of a base recipe.\u003c/p\u003e"},{"header":"Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e \u003ch2\u003eMachine Learning Preface\u003c/h2\u003e \u003cp\u003eBecause of the general unease amongst many communities about the use of ML in experimentation, it is important to clarify the role of ML in these experiments. The use of ML in scientific studies is resulting in calls for identifying rigorous parameters in ML experiments to ensure repeatability and the scientific nature of the work\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e. For example, generally the dependent variable in an experiment is chosen, and an independent variable is what results upon running the experiment on the dependent variable. If in an experiment that uses ML has a dependent variable which is placed into a prompt and an independent variable is collected, the reproducibility and scientific nature of this work would be questionable as the variables for the MLS are not identifiable nor repeatable.\u003c/p\u003e \u003cp\u003eThus, in experiments where ML provides the ultimate independent variable, the nature of the work needs to be verified as repeatable and following the scientific method. However, in this work, ML responses to prompts are not being taken as data points but more as dependent variables that are being checked by product formation to see if ML provides useful direction to the improvement of the qualitative and semi-quantitative specifications. Therefore, the resulting independent variable is the specifications of the created product and not what the ML reports. 
Thus, the fitness of the product is the data used in this paper to determine the usefulness of ML as a tool to improve the formulation process in the hands of student formulators. The formulation process can be defined for this work as the initial recipe being taken all the way to the final product. An improvement in the formulation process would be the attaining of a product faster or attaining a resulting iteration of a product with improved specification when using ML produced directions. Although a significant portion of this work is seen as qualitative, or semi quantitative, mush of the decisions about the quality of personal care products occupies such a space. Observing the quality of the product in terms of the specifications as the formulation process proceeds is an indicator of the effectiveness of ML in training of the novice student formulators.\u003c/p\u003e \u003cp\u003eThis research concentrates on using ML to propose changes to an initial product formula and to enhance the qualitative characteristics of cosmetics products with a minor exploration into cost reduction. The degree of focus on these factors is not based on relative interest in these factors but on the relative effectiveness of ML to contribute to improving these factors. 
The MLS utilized was ChatGPT, which is currently free and open source to the public.\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e Although other MLS were considered, the most likely contenders considered were: 1) Agent GPT: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://agentgpt.reworkd.ai/\u003c/span\u003e\u003cspan address=\"https://agentgpt.reworkd.ai/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e, 2) Literally Anything \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.literallyanything.io/\u003c/span\u003e\u003cspan address=\"https://www.literallyanything.io/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e, and 3) Ora: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://ora.ai/dashboard\u003c/span\u003e\u003cspan address=\"https://ora.ai/dashboard\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e. ChatGPT 3.5 was chosen because of its popularity, familiarity, and because it is currently always free. 
It is unlikely, however, that it would have performed better than the other three platforms and possibly many others.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec4\" class=\"Section2\"\u003e \u003ch2\u003eMachine Learning for Formula Generation\u003c/h2\u003e \u003cp\u003eThe ML program utilized was ChatGPT 3.5, a ML language model based on the GPT-3.5 architecture with a professed prompt limit of approximately 600 words.\u003csup\u003e\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e11\u003c/span\u003e\u003c/sup\u003e As a natural language processing model, ChatGPT excels at comprehending and generating human-like text in multiple languages.\u003csup\u003e\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e12\u003c/span\u003e\u003c/sup\u003e The capabilities of ChatGPT version 3.5 include natural language understanding, text generation, content creation, answering questions, explaining concepts, and assisting in coding tasks.\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eIts knowledgebase is limited to information available up until January 2022, and it does not process human emotion or experiences.\u003csup\u003e\u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e10\u003c/span\u003e\u003c/sup\u003e The responses are generated based on text patterns, and verification is recommended. It is pointed out that the biggest shortcoming of ChatGPT application in chemical formulation and in chemistry in general is the lack of accurate mathematical calculations and the fabrications of false references and text content.\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e\u003c/p\u003e \u003cp\u003eTo compensate for these weaknesses and maintain repeatability when making decisions about formula changes based on ML feedback, Artrith et al. 
\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e recommend following a checklist when using ML tools in chemistry. They recommend that each researcher establishes a mechanism for clear results and reporting when using ML models like ChatGPT.\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e These mechanisms could be considered when documenting any formulation adjustments made by ML. This includes, most importantly, fact-checking suggestions by searching for the chemical names in publicly available libraries to ensure safety of the components and weighing the environmental and consumer experience impact before adopting any changes.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec5\" class=\"Section2\"\u003e \u003ch2\u003eFormula Synthesis\u003c/h2\u003e \u003cp\u003eThis study began with a base formula: a standard recipe for a face mask that a local professional cosmetic chemist provided. The face mask is designed to be left on for 10-minutes and washed off with water. It is not an overnight face mask, but the formula ensures that leaving the mask on for a more extended period would not be an issue. Additionally, the product could be used as a hand or body moisturizing lotion instead of as a face mask.\u003c/p\u003e \u003cp\u003eWith this provided initial recipe, three sets of ML experiments were proposed. The first set of experiments involved using ML to find replacement components for ingredients in the provided recipe that were unavailable to the authors. This set of experiments dealt with finding prompts that effectively provided the MLS with an initial understanding of the ingredients, their relative concentrations, and the proposed role of the specific component. Identifying the role of the components was a subset of the first experiment. With a specified role, the MLS could locate a replacement component. 
As with all three sets of experiments, the first set of experiments was followed up by making various formulas for the product with the substituted ingredients and analyzing the resulting properties, specification, stability, and microbial resistance.\u003c/p\u003e \u003cp\u003eThe second set of experiments revolved around identifying formula replacements based on desired specifications. This process was heavily impacted by the prompts given to the MLS. Providing enough information in prompts to keep the MLS up to date through any back-and-forth conversations was critical. Again, this set of experiments was followed up by making various formulas for the product with the suggested ingredients and analyzing the resulting properties, specification, stability, and microbial resistance.\u003c/p\u003e \u003cp\u003eThe third set of experiments was focused on lowering the cost of the product while maintaining the quality with the desired specifications. The prompts in this set of experiments attempted to blend price findings through research outside the MLS with the feedback provided by the MLS.\u003c/p\u003e \u003cp\u003eThroughout this study, approximately 20 different ML recipe iterations were generated, formulated, and challenged. At least 40 iterations were generated but were not created because of apparent discrepancies in the ML-provided suggestions. It became clear that ML needs to have a semi-trained eye overseeing its responses at all times. During this process, different formulators were trained in the production of the base formula and how to query ML to make changes to the formula based on specification enhancements or cost. Specifications and cost per item and batch were recorded. Some in-house code was written to correct any miscalculated ML suggestions. When ML professes to know something that is incorrect or provides a calculation or data that is incorrect, it is commonly known as a hallucination. 
The code was written to correct mathematical hallucinations associated with the formula suggestions (see additional materials for the code).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec6\" class=\"Section2\"\u003e \u003ch2\u003eFormula Testing\u003c/h2\u003e \u003cp\u003eTo ensure the formulas created were stable and microbe resistant, challenges were performed regarding the stability and microbe propagation of the formula. The results gave a general direction for the ingredients to request in subsequent ML prompts. The stability test was a one-week test at 50\u0026deg;C with no added humidity. The product was well covered, and dehydration resulted in condensation of water on the top of the container and an expansion of the container lid, indicating a slight increase in the pressure on the sample and a decrease in the product\u0026rsquo;s water content as the challenge proceeded. The micro challenge was performed by inoculating 5 mL of sterile Luria\u0026ndash;Bertani (LB) medium with approximately 50 \u0026micro;L of the product to be challenged. The resulting mixture was swirled at 37\u0026deg;C for 24 hours, and the sample\u0026rsquo;s clarity and odor were observed. Bacterial growth was also detected in control samples to verify the method was reliable. If there was any question about whether microbial growth was occurring, the clarity of the solution was quantified using a visible spectrometer to monitor the solution\u0026rsquo;s absorbance at 600 nm.\u003c/p\u003e \u003cp\u003eCastro et al. \u003csup\u003e\u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e13\u003c/span\u003e\u003c/sup\u003e demonstrated that ChatGPT does not always provide curated data and that it struggles with understanding the context of chemistry prompts. During experimentation, the ML-created recipe iterations were also provided in a format that did not translate directly to lab work. 
To test these recipe iterations, a code was created that translated the ML recipe formula into percentages and grams to ensure the required mass of the suggested substitution and to verify the desired total mass of material was created (see additional materials for code).\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eTesting Different Iterations\u003c/h2\u003e \u003cp\u003eTen novice, student formulators worked in groups of two or three people, with different groups creating the ML iterations of the recipe, sometimes in duplicate. Iterations were created by one of the various groups; the resulting product was tested for specifications by a different group. The specification testing was consistently done by the same individual to help maintain specification integrity. The qualitative properties of the formula that were tested were odor, color, pH, viscosity, skin hydration, after-wash effect, and any other noticeable changes between recipes. Quality control testing for microbial growth and stability was then conducted. Upon completion of testing, each iteration underwent one of two steps: 1) if the formula had undergone undesirable changes, it would be returned for additional ML adjustments or 2) if the formula had undergone favorable changes, it would be put before the remaining formulators (n\u0026thinsp;=\u0026thinsp;10) and a small set of consumers (n\u0026thinsp;=\u0026thinsp;4) for additional feedback. The additional feedback helped direct future decisions about changes to the recipe, which was then also returned for further ML iterations.\u003c/p\u003e \u003c/div\u003e"},{"header":"Results","content":"\u003cp\u003eFigure \u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e is a flow chart for some of the iterations. It shows the initially provided recipe and the adjusted base recipe based on the initial ML suggestions from the available library of materials. 
The 10 additional iterations included in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e show some of the changes made per the ML suggestions based on a desire to improve specifications. A total of approximately 25 different ML-driven iterations were performed.\u003c/p\u003e \u003cp\u003eTable\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e shows the qualitative and quantitative specifications that were collected on 17 of the resulting products. When specifications were not measured, it was usually because of an egregious result that made further evaluation unnecessary. All 17 samples in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e passed the relatively straightforward micro challenge, but not all 17 samples passed the stability challenge. Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e provides the reader an opportunity to visualize six samples to get a feel for the resulting face mask product before (top two rows) and after (bottom two rows) stability testing. The second sample from the left in Fig.\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e2\u003c/span\u003e (which is labeled sample 5 from Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e) is an example of a product that had a strong separation after the stability challenge.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cp\u003eThe viscosities reported in Table\u0026nbsp;\u003cspan refid=\"Tab1\" class=\"InternalRef\"\u003e1\u003c/span\u003e were taken with an NDJ-5S digital viscometer immediately after the probe was placed into the sample. The viscosity of the samples was taken again after the probe had been in the sample for two minutes. 
The percentage difference between the initial measurement and the measurement at two minutes is also reported, in parentheses, in the viscosity column of Table\u0026nbsp;1. This percentage provides quantitative information as to the Newtonian nature of the product. A large percentage change is an indication of a more non-Newtonian material. The differences in the Newtonian nature of the samples account for the quantitative viscosities not matching perfectly with the qualitative rating of the viscosities.\u003c/p\u003e \u003cp\u003e \u003cdiv class=\"gridtable\"\u003e\u003ctable float=\"Yes\" id=\"Tab1\" border=\"1\"\u003e \u003ccaption language=\"En\"\u003e \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e \u003cdiv class=\"CaptionContent\"\u003e \u003cp\u003eResults of formula iterations. Specifications were determined by the same formulators to improve qualitative consistency. The ideal rating is given in parentheses in each column heading. Specifications that were not determined (ND) due to egregious product failure are indicated.\u003c/p\u003e \u003c/div\u003e \u003c/caption\u003e \u003ccolgroup cols=\"8\"\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c1\" colnum=\"1\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c2\" colnum=\"2\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c3\" colnum=\"3\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c4\" colnum=\"4\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c5\" colnum=\"5\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c6\" colnum=\"6\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c7\" colnum=\"7\"\u003e\u003c/div\u003e \u003cdiv align=\"left\" class=\"colspec\" colname=\"c8\" colnum=\"8\"\u003e\u003c/div\u003e \u003cthead\u003e \u003ctr\u003e \u003cth align=\"left\" colname=\"c1\"\u003e \u003cp\u003eSample Number 
Brief Description\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c2\"\u003e \u003cp\u003eOdor (++)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c3\"\u003e \u003cp\u003eViscosity (+++)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c4\"\u003e \u003cp\u003eAppearance (+)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c5\"\u003e \u003cp\u003eSkin hydration (++)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c6\"\u003e \u003cp\u003eAfter effect (+)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c7\"\u003e \u003cp\u003eViscosity in Pa\u0026middot;s\u003c/p\u003e \u003cp\u003e(% change after 2 min)\u003c/p\u003e \u003c/th\u003e \u003cth align=\"left\" colname=\"c8\"\u003e \u003cp\u003e% change price per gram (Base cost \u003cspan\u003e$\u003c/span\u003e0.034/g)\u003c/p\u003e \u003c/th\u003e \u003c/tr\u003e \u003c/thead\u003e \u003ctbody\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e1) Base\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e--\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e0%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e2) XG as emulsifier/ viscosity modifier 10% total weight\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c2\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e+++++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-20%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e3) XG changed to 2% total weight\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e++++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e30.0 (3.2)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-50%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e4) GG as emulsifier/viscosity modifier 0.1% total weight\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e+\u003c/p\u003e 
\u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-51%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e5) GG changed to 1% total weight\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-51%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e6) GG changed to 1.2% total weight\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e++++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-50%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" 
colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e7) Siligel as emulsifier/ viscosity modifier 2% total weight\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e67.4 (76)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-12%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e8) XG \u0026amp; GG 50/50 mixture 1.2% total weight\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e++++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e35.9 (19)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-49%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e9) Peppermint fragrance, Shea Butter as Skin Conditioning Agent\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e 
\u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e22.6 (17.9)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-54%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e10) Siligel as emulsifier/viscosity modifier 3% total weight\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e9%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e11) Witch Hazel as humectant\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e++++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd 
align=\"left\" colname=\"c7\"\u003e \u003cp\u003e39.5 (4.5)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e27%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e12) Changed SLS to Polyglucose as emulsifying agent\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e62.7 (52)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e7%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e13) Peg A Dimethicone w/ Polyglucose\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e57.0 (15)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-1%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e14) Coco Betain w/ 
Polyglucose\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e66.0 (35)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e41%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e15) Replace Siligel w/ Carbomer\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e+++++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e-\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e30.1 (37)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-57%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e16) Optiphen as preservative\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" 
colname=\"c5\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e23.0 (36)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e-1%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e17)\u0026nbsp; Hyaluronic Acid as active\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e+++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003e++\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003e+\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003e40.5 (7.2)\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e10%\u003c/p\u003e \u003c/td\u003e \u003c/tr\u003e \u003ctr\u003e \u003ctd align=\"left\" colname=\"c1\"\u003e \u003cp\u003e\u003cb\u003e18) Aloe added as a humectant\u003c/b\u003e\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c2\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c3\"\u003e \u003cp\u003e--\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c4\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c5\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c6\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c7\"\u003e \u003cp\u003eND\u003c/p\u003e \u003c/td\u003e \u003ctd align=\"left\" colname=\"c8\"\u003e \u003cp\u003e41%\u003c/p\u003e \u003c/td\u003e 
\u003c/tr\u003e \u003c/tbody\u003e \u003c/colgroup\u003e \u003c/table\u003e\u003c/div\u003e \u003c/p\u003e \u003cp\u003e \u003c/p\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003ch2\u003eSeeking Substitutions\u003c/h2\u003e \u003cp\u003e The initial recipe provided as a base recipe contained ingredients that were not immediately available to the formulators. Thus, the first experimental question was whether ML could produce an adjusted base recipe from the library of ingredients available to the formulators. The originally provided base recipe, along with the current library of ingredients, was provided in a prompt to ML, and replacement suggestions for the missing components were sought. ML provided a new adjusted base recipe, and the formula was created. The product was lacking in many ways, but it provided a great boost in productivity, creating a quick place to start the development process.\u003c/p\u003e \u003cp\u003eML was extremely good at this step, and this felt like the most empowering result. ML was able to pull together a product that had qualitatively measurable specifications from a relatively limited, although diverse, library of materials. At this point, the formulation process could turn to specific improvements. From the conception of the product to the step of having a product that was ready for revision, it felt like there was a considerable benefit to using ML.\u003c/p\u003e \u003cp\u003eOnce further adjustments were needed to address identified specifications with materials in the current library or with novel materials, the process slowed down. Gains on the product were not always attained when ML was asked to provide a product that had improved specifications. Usually, the new product showed a change in the desired specification, which is a positive, but the change was not always an improvement. Additionally, the suggestions became less impressive and less exploratory in nature. 
The suggestions were more predictable to more seasoned formulators.\u003c/p\u003e \u003cp\u003eA comment should also be made regarding the direction sought from ML as to the amount that should be added to achieve the desired specification. The amount of each substance to be added was rarely provided by ML directly, even if the rest of the recipe was provided. Instead, ML responded with advice to follow the manufacturer\u0026rsquo;s recommended amounts. Further insight could be gained by asking the same question from a different approach, but it was quickly clear that either not enough information was available, or ML did not have the ability to parse all of the necessary variables in the particular situation to provide a reliable amount of a replacement ingredient. Therefore, to complete such a process, multiple rounds were necessary to ascertain whether the suggested substitution was truly unacceptable or whether its concentration was simply not correct to produce the desired change. This began to feel like the ML was slowing things down. Formulators wanted to apply their own new ideas before exploring the ML suggestions because the new products that were made were easily characterizable, and smaller gains seemed to be made in this back-and-forth situation.\u003c/p\u003e \u003cp\u003eThe third set of experiments then explored improving the cost of the product by asking ML to provide replacement options for higher-cost ingredients. ML seemed confident in its suggestions; however, it again quickly felt like ML was out of its element when asked to gather information on pricing. This is not necessarily the fault of ML or a hallucination of ML, but it seemed to be another scenario in which there were too many variables to account for, and those variables could change too frequently, when identifying the price of an ingredient. 
General information about prices of materials was generally reliable, but specifics were not robust enough to provide reliable information compared to internet searches. Thus, ML suggestions needed to be double-checked with internet and vendor catalog searches to verify pricing. The contribution of ML was not easily evident in these price-driven experiments.\u003c/p\u003e \u003cp\u003eThis third set of experiments ended up having the same feeling as the second set, in which the mass percent of the suggested substitution was always in question. Thus, any lack of specification improvement or maintenance, in the case of an attempted cost reduction, required many iterations to verify. Hence, this did not raise the ability of ML above that of a simple internet search.\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec10\" class=\"Section2\"\u003e \u003ch2\u003eMachine Learning\u003c/h2\u003e \u003cp\u003e ML is always quick to offer many suggestions to adjust cosmetic recipes when provided specific prompts about the type of product being formulated. However, it is important to note that the ability of ML to retain information from previous interactions, even within the same chat session, is limited. This means that while it excels in ideas for quick adjustments based on current input, and often will behave as if it has retained previous information from recent conversations, it is limited in its ability to leverage information from earlier discussions or remember past formulations to make more complex decisions based on previous results. Thus, it relies heavily on the information provided in the immediate prompt. This makes it less able to contribute intelligently to the usual gradual adjustments in the formulation process.\u003c/p\u003e \u003cp\u003eAdditionally, there seems to be an issue with ML as far as the attention given to the prompts. 
There is a stated limit on prompt size; however, there also seems to be an internal limitation in how ML gathers or weights the beginning portion of a longer prompt relative to the latter portion. It seems that ChatGPT, for example, will not always identify that a given prompt exceeds the specified limit, and it will end up addressing only the latter part of the prompt. It is as if the size limit is counted backward from the end of the prompt, so that anything beyond that limit at the beginning of the prompt is not considered. This is just an observation from this work and is not referenced in any other work. Thus, the value of this specific finding should be considered accordingly.\u003c/p\u003e \u003cp\u003eWhile ML can excel in its ability to provide quick suggestions for adjusting cosmetic recipes based on given parameters, it is relatively unreliable in terms of mathematical accuracy.\u003csup\u003e\u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e1\u003c/span\u003e\u003c/sup\u003e ML particularly falls short in making decisions about necessary amounts for synthesizing cosmetic products. While it can provide estimations and suggestions, its calculations are frequently incorrect. The suggestions are not necessarily incorrect in terms of meeting desired specifications, but ML is not reliable when asked to determine percent composition or the appropriate mass to add to reach a desired mass percent. Therefore, it is essential for formulators to double-check and validate these calculations to ensure the accuracy and safety of the final product.\u003c/p\u003e \u003cp\u003eML does have access to chemistry knowledge\u003csup\u003e\u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e7\u003c/span\u003e\u003c/sup\u003e and can piece much of the data together to provide simple answers and enough feedback to easily trick a novice into thinking it has the ability to project more complex answers to questions with more variables. 
However, as mentioned above, the answers provided to complex problems are usually incorrect. Unfortunately, the inaccuracies in the answers sometimes require expert knowledge to identify.\u003c/p\u003e \u003cp\u003eWhile it is challenging to quantify the depth of knowledge and abilities of ML, providing a little context in the prompt, such as explaining the project at hand, enables ML to make more connections with its database and thus offer more suggestions on how to proceed. These suggestions can include commonly used chemicals in personal care products. For instance, when presented with the task of enhancing a specific aspect of a base recipe for a face mask, the MLS was able to generate a suggestion that would be helpful to a novice formulator. A large, detailed prompt is required to get the ML to provide the illusion that it understands the issue at hand. The template that was generally used is provided below.\u003cdiv class=\"BlockQuote\"\u003e\u003cp\u003ePROMPT: \u0026ldquo;The following text is the recipe for a gel-based cosmetic face mask:\u003c/p\u003e\u003cp\u003ePurified water: 70.65%\u003c/p\u003e\u003c/div\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eEDTA: 0.15%\u003c/p\u003e \u003cp\u003eGlycerin: 6.00%\u003c/p\u003e \u003cp\u003eButylene Glycol: 3.00%\u003c/p\u003e \u003cp\u003eXanthan gum: 0.6%\u003c/p\u003e \u003cp\u003eGuar Gum: 0.6%\u003c/p\u003e \u003cp\u003eAntiMicrobial Banana mixture: 0.90%\u003c/p\u003e \u003cp\u003eSodium Lauryl Sulfate: 1.00%\u003c/p\u003e \u003cp\u003eMango Butter: 3.00%\u003c/p\u003e \u003cp\u003eOlive Oil: 5.00%\u003c/p\u003e \u003cp\u003eCoconut Oil: 6.00%\u003c/p\u003e \u003cp\u003eVitamin E: 0.50%\u003c/p\u003e \u003cp\u003ePolyglyceryl Oleate: 2.00%\u003c/p\u003e \u003cp\u003ePapaya Banana: 0.10%\u003c/p\u003e \u003cp\u003eProtein-Hyaluronate blend: 0.50%\u003c/p\u003e \u003cp\u003eThe desired yield of 
this recipe is 50 grams.\u003c/p\u003e \u003cp\u003eFrom this face mask recipe, suggest a substitute for the Xanthan gum in the correct scaled mass amount that will give a creamy consistency.\u0026rdquo;\u003c/p\u003e \u003cp\u003eML RESPONSE: \u0026ldquo;Xanthan gum is often used in cosmetic and skincare formulations to provide a thickening and stabilizing effect. If you're looking for a substitute for xanthan gum that will give your face mask a creamy consistency, you can consider using Guar Gum as a replacement. Guar gum is a natural thickener and stabilizer commonly used in cosmetic products and can be substituted for xanthan gum.\u003c/p\u003e \u003cp\u003eTo determine the correct scaled mass amount of guar gum as a substitute for the 5 grams of xanthan gum, you can use a similar proportion. Here's the calculation:\u003c/p\u003e \u003cp\u003eOriginal Xanthan Gum: 5 g\u003c/p\u003e \u003cp\u003eGuar Gum Substitute: (5 g x Mass of Guar Gum) / Mass of Xanthan Gum\u003c/p\u003e \u003cp\u003eTo calculate the mass of Guar Gum, you can rearrange the equation:\u003c/p\u003e \u003cp\u003eMass of Guar Gum = (5 g x Mass of Guar Gum) / 5 g\u003c/p\u003e \u003cp\u003eMass of Guar Gum\u0026thinsp;=\u0026thinsp;Mass of Guar Gum\u003c/p\u003e \u003cp\u003eSo, you can substitute 5 grams of Guar Gum for the 5 grams of xanthan gum in your recipe to achieve a creamy consistency.\u0026rdquo;\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003cp\u003eAlthough rudimentary, ML successfully answered the question and suggested a reasonable substitute for a thickener. It is also important to note that the formulators relied on a template prompt to achieve these results. However, in this specific case, the product's quality decreased when the two formulas were compared. The texture suffered because the product became too thick: the two materials cannot be substituted at a 1:1 mass ratio. 
This is not difficult to fix, but in this case ML was no more helpful than a simple internet search.\u003c/p\u003e \u003cp\u003eAs mentioned, ML is not proficient in the mathematical calculations expected for formulation creation. ML was unable to make a chemical substitution that maintained the desired mass yield and percentage of the chemical component. For that reason, the formulators wrote code that readjusted the ML-suggested mass to the base recipe. This was a necessary intermediate step used to help maintain a consistent mass of product while adhering to the percentage of each component from the base recipe. This is an example of a necessary intervention to ensure that multi-variate ML suggestions were accurate or at least adjusted to be accurately applied.\u003c/p\u003e \u003cp\u003e Regarding ML capabilities in chemical formulation, ML was very successful in quickly allowing formulators in training to adjust a base recipe of a cosmetic formula to achieve a stable starting point from a provided library. A sellable product emerged in just two iterations. In the hands of formulators in training, ML was partially helpful in enhancing product characteristics, but the requirement for the users to have formulation training, albeit minimal, made the ML contribution more cumbersome. Thus, the efficiency of the second round of experiments decreased quickly from iteration to iteration. This lack of improvement seemed to be more than a typical case of diminishing marginal returns; it appeared to stem from an obstacle in the ML recommendation that could not be specifically identified (i.e. 
wrong amount or wrong role of the ingredient for the specification in the prompt).\u003c/p\u003e \u003cp\u003eRegarding the usage of ML to accommodate cost calculations per unit or per batch, formulators did not trust that the suggestions provided by ML were moving the product in a cost-decreasing direction while maintaining the quality of previous iterations. In other words, because the second set of experiments was only marginally helpful after two or three iterations, adding the further task of maintaining specifications while lowering a quantitative value (the cost) seemed to be too much for ML in the hands of new formulators. The calculated cost of the iterations did fluctuate, but this fluctuation was comparable in magnitude to the price variation between vendors for the materials used, thus nullifying any real gains from ML proposals.\u003c/p\u003e \u003c/div\u003e"},{"header":"Conclusion","content":"\u003cp\u003eML can serve as a valuable aid in helping new formulators learn how to adjust cosmetic recipes to achieve a high-quality product very quickly. The effectiveness of ML in contributing to deeper learning by new formulators is still contingent upon the supervision of a trained formulator. While minimally trained formulators can benefit from ML guidance, expert formulators may find that the ML suggestions offer few novel insights because of the limitations of the data sources available to the MLS. The level of training and expertise of the formulator significantly influences the extent to which ML contributes effectively to the formulation process. 
There exists a balance where undertrained individuals can receive valuable assistance, yet seasoned formulators might not find groundbreaking insights, as the database available to ML may not be as comprehensive or up to date as their wealth of experience and knowledge.\u003c/p\u003e \u003cp\u003eIn future studies, it is suggested that the experiments be repeated with a larger n value for the consumer testers and that consumer feedback be used more within the iteration loop. However, it may be too early to do this as the feedback may not be as focused as a professional formulator\u0026rsquo;s experience. Another suggestion could be to quantify the qualitative specifications with a rubric, which will also come with its own set of problems. By assigning each quality a quantitative measurement on a fixed scale, future studies could compose an average overall \u0026ldquo;score\u0026rdquo; for each formulation. However, locking in on such a score could actually lead the formula to a local optimum that is not actually ideal.\u003c/p\u003e \u003cp\u003eBoth of these suggestions are independent of ML. A truly high-risk proposal would be to see if ML could help quantify the resulting feedback provided by consumers. In such a case, ML would truly be producing the \u0026ldquo;usable\u0026rdquo; data. Although possibly highly rewarding, this could also come with disastrous effects given the track record of ML with quantifiable parameters such as mathematical calculations or cost reduction. As previously mentioned, such an endeavor would require strict adherence to mechanisms that ensure repeatability and adherence to the scientific method\u003csup\u003e\u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e4\u003c/span\u003e\u003c/sup\u003e.\u003c/p\u003e \u003cp\u003eFuture researchers should note the possibility of supporting ML with additional code for a specific recipe or laboratory setting to convert ML responses into usable data. 
Building and using such auxiliary programs has the potential to make working with ML more efficient and possibly extensible to other areas of formulation.\u003c/p\u003e \u003cp\u003eBased on the results of the research, it can be concluded that ML shows potential in adjusting cosmetic formulations and providing suggestions for chemical substitutions in the hands of novice formulators. ML demonstrates some background knowledge in chemistry, particularly when given context in the chat prompt. However, it lacks a reliable memory within the same chat and struggles with mathematical calculations for precise adjustments. For new formulators, the iterative process of recipe adjustments utilizing ML yielded mixed results, with varying levels of success in achieving the desired product specifications. Additionally, ML was not considered proficient or significantly helpful in making effective cost adjustments.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003cstrong\u003eStatements and Declarations\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis work was supported by internal funding from Utah Valley University, College of Science.\u0026nbsp;The authors declare that no external funds, grants, or other support were received during the preparation of this manuscript.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting Interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors have no relevant financial or non-financial interests to disclose.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthor Contributions\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eAll authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by all of the authors. The first draft of the manuscript was written by Hazel Ticas, followed by significant changes and contributions by all authors. The final draft was written by Daniel Scott. 
Subsequently all authors commented on the final versions of the manuscript. All authors read and approved the final manuscript.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eTyson, J. (2023). Shortcomings of ChatGPT. \u003cem\u003eJournal of Chemical Education\u003c/em\u003e, \u003cem\u003e100\u003c/em\u003e(8), 3098\u0026ndash;3101. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1021/acs.jchemed.3c00361\u003c/span\u003e\u003cspan address=\"10.1021/acs.jchemed.3c00361\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eLee, C. K. H., Choy, K. L., \u0026amp; Chan, Y. N. (2014). A knowledge-based ingredient formulation system for chemical product development in the personal care industry. \u003cem\u003eComputers \u0026amp; Chemical Engineering\u003c/em\u003e, \u003cem\u003e65\u003c/em\u003e, 40\u0026ndash;53. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.compchemeng.2014.03.004\u003c/span\u003e\u003cspan address=\"10.1016/j.compchemeng.2014.03.004\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eBannigan, P., Aldeghi, M., Bao, Z., H\u0026auml;se, F., Aspuru-Guzik, A., \u0026amp; Allen, C. (2021). Machine learning directed drug formulation development. \u003cem\u003eAdvanced Drug Delivery Reviews\u003c/em\u003e, \u003cem\u003e175\u003c/em\u003e, 113806. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.addr.2021.05.016\u003c/span\u003e\u003cspan address=\"10.1016/j.addr.2021.05.016\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eArtrith, N., Butler, K. T., Coudert, F. X., et al. (2021). 
Best practices in machine learning for chemistry. \u003cem\u003eNature Chemistry\u003c/em\u003e, \u003cem\u003e13\u003c/em\u003e(6), 505\u0026ndash;508. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41557-021-00716-z\u003c/span\u003e\u003cspan address=\"10.1038/s41557-021-00716-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eMurray, J. D., Lange, J. J., Bennett-Lenane, H., et al. (2023). Advancing algorithmic drug product development: Recommendations for machine learning approaches in drug formulation. \u003cem\u003eEuropean Journal of Pharmaceutical Sciences\u003c/em\u003e, \u003cem\u003e191\u003c/em\u003e, 106562. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.ejps.2023.106562\u003c/span\u003e\u003cspan address=\"10.1016/j.ejps.2023.106562\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eXiao, P., Zhang, X., Pan, W., et al. (2020). The Development of a Skin Image Analysis Tool by Using Machine Learning Algorithms. \u003cem\u003eCosmetics\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e(3), 67. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/cosmetics7030067\u003c/span\u003e\u003cspan address=\"10.3390/cosmetics7030067\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eShi, Y. F., Yang, Z. X., Ma, S. (2023). Machine Learning for Chemistry: Basics and Applications. \u003cem\u003eEngineering\u003c/em\u003e. 
Published online July 31, \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.eng.2023.04.013\u003c/span\u003e\u003cspan address=\"10.1016/j.eng.2023.04.013\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eKorzyński, P., Mazurek, G., Krzypkowska, P., \u0026amp; Kurasiński, A. (2023). Artificial intelligence prompt engineering as a new digital competence: Analysis of generative AI technologies such as ChatGPT. \u003cem\u003eEntrepreneurial Business and Economics Review\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e(3), 25\u0026ndash;37. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.15678/eber.2023.110302\u003c/span\u003e\u003cspan address=\"10.15678/eber.2023.110302\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eAljanabi, M., Yaseen, M. G., Ali, A. H., \u0026amp; Mohammed, M. A. (2023). Prompt Engineering: Guiding the Way to Effective Large Language Models. \u003cem\u003eIraqi Journal for Computer Science and Mathematics\u003c/em\u003e, \u003cem\u003e4\u003c/em\u003e(4), 151\u0026ndash;155. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.52866/ijcsm.2023.04.04.012\u003c/span\u003e\u003cspan address=\"10.52866/ijcsm.2023.04.04.012\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eOpenAI. Introducing ChatGPT. OpenAI Blog (2022). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://openai.com/blog/chatgpt\u003c/span\u003e\u003cspan address=\"https://openai.com/blog/chatgpt\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eO\u0026rsquo;Sullivan, R. (2023). What are ChatGPT\u0026rsquo;s limits? Namecheap. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://www.namecheap.com/blog/what-are-chatgpts-limits/\u003c/span\u003e\u003cspan address=\"https://www.namecheap.com/blog/what-are-chatgpts-limits/\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eFlensted, T. (2023). How Many Languages Does ChatGPT Support? The Complete ChatGPT Language List. SEO.AI. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://seo.ai/bprlog/how-many-languages-does-chatgpt-support\u003c/span\u003e\u003cspan address=\"https://seo.ai/bprlog/how-many-languages-does-chatgpt-support\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e \u003cli\u003e\u003cspan\u003eCastro Nascimento, C. M., \u0026amp; Silva Pimentel, A. (2023). Do Large Language Models Understand Chemistry? A Conversation with ChatGPT. \u003cem\u003eJournal of Chemical Information and Modeling\u003c/em\u003e, \u003cem\u003e63\u003c/em\u003e(6), 1649\u0026ndash;1655. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.1021/acs.jcim.3c00285\u003c/span\u003e\u003cspan address=\"10.1021/acs.jcim.3c00285\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Machine Learning, ChatGPT, Cosmetics, Formulation, novice formulators, face mask","lastPublishedDoi":"10.21203/rs.3.rs-3867796/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-3867796/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThis work explores the application of the questionable use of machine learning (ML), specifically the ChatGPT 3.5 system, in the training of student formulators. Here, an experiment is undertaken to explore the ability of ML to aid in training students in the role of formulators of a personal care product. The focus is on whether or not the students can successfully rely on ML to guide them through the formulation process of a 10-minute hydrating face mask recipe. While exploring the iterative process of recipe adjustments with ML, it was found that the language model demonstrates the ability to help formulators in training due to its above-average knowledge in chemistry, but when given clear prompts, it performs much better at providing helpful suggestions for ingredient substitutions. However, ML lacks a reliable memory, even within a single extended conversation and struggles with mathematical calculations. ML is not found to be proficient in accurately calculating cost adjustments. Additionally, the contribution of ML may only be marginally helpful in the training of more seasoned formulator. 
Despite its limitations, ML can quickly and effectively, in the hands of student formulators in-training, provide direction and support to produce and improve upon a base formula resulting in a quality product.\u003c/p\u003e \u003cp\u003e \u003c/p\u003e","manuscriptTitle":"Utilizing Publicly Accessible Machine Learning for Training Student Formulators in Personal Care Product Formulation: Specification-Driven and Cost-Conscious Experiments","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-01-18 11:52:38","doi":"10.21203/rs.3.rs-3867796/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"
[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"e7153efe-f38f-4b90-b68a-29ae4bf808dc","owner":[],"postedDate":"January 18th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2024-03-29T05:15:24+00:00","versionOfRecord":[],"versionCreatedAt":"2024-01-18 11:52:38","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-3867796","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-3867796","identity":"rs-3867796","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
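The Discussion states that the formulators wrote code to readjust the ML-suggested mass to the base recipe, but the code itself is not reproduced in the paper. The sketch below is an illustration of what such a helper might look like, not the authors' program. The function names `to_masses` and `substitute`, the choice of water as the balancing ingredient, and the `ratio` parameter are all assumptions made for this example. The recipe percentages and the 50 g yield are taken from the prompt template quoted in the Discussion.

```python
from typing import Dict

# Base recipe from the prompt template, in percent by mass.
BASE_RECIPE: Dict[str, float] = {
    "Purified water": 70.65,
    "EDTA": 0.15,
    "Glycerin": 6.00,
    "Butylene Glycol": 3.00,
    "Xanthan gum": 0.60,
    "Guar Gum": 0.60,
    "AntiMicrobial Banana mixture": 0.90,
    "Sodium Lauryl Sulfate": 1.00,
    "Mango Butter": 3.00,
    "Olive Oil": 5.00,
    "Coconut Oil": 6.00,
    "Vitamin E": 0.50,
    "Polyglyceryl Oleate": 2.00,
    "Papaya Banana": 0.10,
    "Protein-Hyaluronate blend": 0.50,
}


def to_masses(recipe_pct: Dict[str, float], batch_g: float) -> Dict[str, float]:
    """Convert percent-by-mass entries into gram amounts for one batch."""
    total = sum(recipe_pct.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError(f"percentages sum to {total:.4f}, expected 100")
    return {name: pct / 100.0 * batch_g for name, pct in recipe_pct.items()}


def substitute(
    recipe_pct: Dict[str, float],
    old: str,
    new: str,
    ratio: float = 1.0,
    filler: str = "Purified water",
) -> Dict[str, float]:
    """Replace `old` with `new` at `ratio` times the old percentage.

    Assumption for this sketch: the difference is absorbed by `filler`
    so that the recipe still sums to 100 percent.
    """
    updated = dict(recipe_pct)
    old_pct = updated.pop(old)
    new_pct = old_pct * ratio
    # If the substitute is already in the recipe, add to its existing share.
    updated[new] = updated.get(new, 0.0) + new_pct
    updated[filler] += old_pct - new_pct
    if updated[filler] < 0:
        raise ValueError("filler cannot absorb the change")
    return updated
```

For the base recipe, `to_masses(BASE_RECIPE, 50.0)["Xanthan gum"]` gives 0.30 g (0.6% of 50 g), not the 5 g assumed in the quoted ML response. A call such as `substitute(BASE_RECIPE, "Xanthan gum", "Guar Gum", ratio=0.5)` uses a placeholder ratio. The paper reports only that a 1:1 swap made the product too thick and does not give a working ratio.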