There is much interest in the new generation of Artificial Intelligence (AI) chatbots, including ChatGPT. In part, such platforms are designed to mimic or reproduce human “conversations”, including scientific discussion, with functionality stretching to writing and debugging computer code.
Without a doubt these new platforms will change the scientific landscape, but will scientists and researchers become redundant? Will future scientists need the same rigorous training that we received? Should we embrace this advancement?
Certainly, ChatGPT incorporates both supervised and reinforcement learning, with performance improved through a human feedback component. But is it possible that most of us can shortcut processes and rely on ChatGPT? Do we really need to know the ins and outs of implementations and the consequences of inconsistencies? Speaking of the consequences of inconsistencies:
In 1628 the Swedish warship Vasa sank 2 km into its inaugural voyage. Archaeologist Fred Hocker suggests that the ship was asymmetric, with a bias to port. He also found four distinct carpenter’s rulers, some with graduations matching the ‘Swedish foot’ and others the ‘Amsterdam foot’, which differed in length by an inch [1]. Interestingly, rigorous stability testing (30 sailors running from port to starboard and back) had been terminated due to the risk of capsizing [2].
You would think we would have learned, but the simultaneous use of metric and imperial measurements resulted in the under-fuelling and crash landing of a 1983 Air Canada Boeing 767 flying from Ottawa to Edmonton [3], and the loss of the NASA/JPL Mars Climate Orbiter, built by Lockheed Martin [4].
Furthermore, a decimal point in the wrong place left Spain’s multi-billion-dollar S-80 submarine critically overweight, forcing a costly redesign [5]. Numerical errors have also proved catastrophic: in 1996 the European Space Agency’s Ariane 5 rocket exploded 37 seconds into its maiden flight when a floating-point value overflowed a 16-bit integer conversion [6].
But what of ChatGPT and our research? Can we take ChatGPT at face value, or do we need to understand enough to check its responses?
Recently, a colleague asked ChatGPT to generate generic MATLAB code implementing the Wright-Fisher model. Below, I have translated the MATLAB code into pseudo-code to make it easier to follow; see what you think. Can we trust ChatGPT’s response? You might like to focus on how the allele frequencies fA and fB are calculated.
……………………………………….
Wright-Fisher model simulation
Define parameters: population size, N; number of generations, tMax; initial frequency of allele A, p
Initialize POP, an Nx2 binary array with row sum 1, recording assignment of allele A or B to N individuals.
Simulate evolution through a for loop of length tMax
    Sum columns of POP and divide by 2*N to calculate the frequency fA of allele A and fB of allele B
    Simulate reproduction and genetic drift:
        Initialize OFFSPRING, an Nx2 binary array with each cell containing a random entry chosen from POP
        Update POP equal to OFFSPRING
end for loop
Print fA and fB
……………………………………….
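For the record, two details deserve scrutiny. With the Nx2 encoding described above, each of the N rows sums to 1, so the column sums already count all N allele copies; dividing by 2*N halves both frequencies, and fA + fB would come out to 1/2 rather than 1. Sampling OFFSPRING cell by cell, rather than row by row, also breaks the row-sum-1 invariant, so an individual could end up carrying zero or two alleles. A minimal corrected MATLAB sketch, assuming a haploid population and illustrative parameter values (my reconstruction, not ChatGPT’s output), might read:
……………………………………….
% Corrected Wright-Fisher sketch (MATLAB). Assumes a haploid population,
% matching the Nx2 one-hot encoding in the pseudo-code above.
N = 100;                         % population size (illustrative value)
tMax = 50;                       % number of generations (illustrative value)
p = 0.5;                         % initial frequency of allele A

POP = zeros(N, 2);               % row [1 0] = allele A, row [0 1] = allele B
hasA = rand(N, 1) < p;
POP(hasA, 1) = 1;
POP(~hasA, 2) = 1;

for t = 1:tMax
    parents = randi(N, N, 1);       % sample N parents uniformly, with replacement
    OFFSPRING = POP(parents, :);    % copy whole rows, preserving row sum 1
    POP = OFFSPRING;
end

f = sum(POP) / N;                % divide by N, not 2*N: rows sum to 1, so the
                                 % column sums already count all N alleles
fprintf('fA = %.3f, fB = %.3f\n', f(1), f(2));
……………………………………….
Run this a few times with small N and the frequencies drift towards fixation at 0 or 1, as the Wright-Fisher model predicts; with the original 2*N divisor, the printed frequencies would never sum to one. A small inconsistency, but as history shows, small inconsistencies can have large consequences.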
While there are obvious shortcomings, many scientists, such as Fields Medallist Terence Tao [7], see value in AI tools for providing approximate solutions to mathematical problems, which, when combined with traditional search, aid in the identification of correct solutions.
How these tools will influence future research remains to be seen; we are still early in their development. In the meantime, a cautious approach and a discerning eye are needed when reviewing responses, to help prevent the consequences of inconsistencies.
Diane Donovan
Chief Investigator, The University of Queensland