status: 06/2023
These days, ChatGPT is the talk of the town. AI enthusiasts and sceptics alike are willing to engage with it. AI serves not only as an almost unlimited source of knowledge, but also as a source of translations and even terminology, not to mention the possibility of generating texts of any kind. Nevertheless, the quality of the results still draws harsh criticism and, not surprisingly, more and more philosophical questions arise about the impact AI has on our daily lives and on mankind in general.
Linguists in particular, a group of users highly sensitive to wordy and imprecise language, must examine machine-generated output critically. The results may be astonishing at first sight, but a closer look reveals weaknesses.
Stress-testing AI with terminology
Terminology is a very suitable subject for a stress test on ChatGPT: specific terms cannot simply be “translated” into a target language, but require careful research into the underlying concept, where context matters more than anything else. The result of term research is an “equivalent” in the target language. Depending on the culture of the target language, a concept may in the worst case not (yet) exist at all, or there may be fuzzy edges or overlaps depending on the context. For instance, when comparing the concepts and definitions of “transmission” and “gearbox” in British and US English, there is not only a certain overlap due to fuzzy edges, but also a difference in usage depending on the country. Similarly, “Antrieb” and “Getriebe” in German (DE) must be differentiated in operating instructions.
Now, let us dive into ChatGPT for terminology research
When asked to translate DE “Blitzstossspannung” into English, ChatGPT returns “lightning impulse voltage”, which is correct in the context of high voltage. However, as the complexity of the compound words rises, ChatGPT encounters difficulties:
- DE: Blitzstossspannung
EN: lightning impulse voltage
↪ Correct. Source: IEC 60060-1 “Blitzstossspannung/lightning-impulse voltage”
- DE: Stehblitzstossspannung (1st try)
EN: withstand impulse voltage
↪ False. “lightning” is missing
- DE: Stehblitzstossspannung (2nd try)
EN: withstand impulse voltage
↪ False. “lightning” is missing. Correct would be: “lightning impulse withstand voltage”
- DE: Bemessungssteh-Blitzstossspannung (1st try, with hyphen)
EN: rated impulse voltage
↪ False. The DE element “steh-” is not rendered: “withstand” is missing in the answer
- DE: Bemessungsstehblitzstossspannung (2nd try, without hyphen)
EN: rated withstand impulse voltage
↪ False. “lightning” is still missing.
- DE: Bemessungssteh-Blitzstossspannung (3rd try, with hyphen)
EN: rated withstand impulse voltage
↪ False. “lightning” is missing. Correct would be: “rated lightning impulse withstand voltage”
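The missing elements in the answers above can also be spotted mechanically. The following is only a minimal sketch in Python: the mapping REQUIRED_ELEMENTS and the function missing_elements are hypothetical names, the mapping covers just the compound elements of the examples above, and the naive substring matching assumes the spelling with “ss” used here.

```python
# Hypothetical mini-glossary: German compound elements and the English
# element each one should contribute (taken from the examples above).
REQUIRED_ELEMENTS = {
    "bemessungs": "rated",
    "steh": "withstand",
    "blitz": "lightning",
    "stoss": "impulse",
    "spannung": "voltage",
}


def missing_elements(german_term: str, english_candidate: str) -> list[str]:
    """Return the English elements that the candidate fails to render."""
    # Remove hyphens so that both spellings of the compound are treated alike.
    source = german_term.lower().replace("-", "")
    candidate_words = set(english_candidate.lower().replace("-", " ").split())
    missing = []
    for german_part, english_part in REQUIRED_ELEMENTS.items():
        # Naive substring test: enough for these examples, not for real texts.
        if german_part in source and english_part not in candidate_words:
            missing.append(english_part)
    return missing


result = missing_elements("Bemessungssteh-Blitzstossspannung", "rated impulse voltage")
```

For “Bemessungssteh-Blitzstossspannung” and the answer “rated impulse voltage”, the check reports “withstand” and “lightning” as missing. It says nothing about word order, so it cannot replace a look at the standard.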
Without feedback to the machine, the results seem to vary with the number of queries, but only in certain cases. Once feedback has been given, however, the machine has already learned the correction by the next query.
The main problem, however, is the missing source and context in the result
Source and context information is crucial not only for correct source texts, but particularly for translations of specialized texts. Standardized language is not only the key to concise texts and fewer misunderstandings, it also brings substantial savings through text re-use in both the writing and the translation of texts. As larger terminology databases such as Electropedia, DIN, IEEE, ANSI, EuroTermBank and others are accessible free of charge, one would expect the source of a term to appear in the ChatGPT results without having to ask for it.
Even when giving details in the query, the result is not exactly as expected:
Example 1
Question: What is the correct Spanish equivalent for circuit-breaker according to IEC 441-14-20?
Answer by ChatGPT: According to IEC 441-14-20, the correct Spanish equivalent for circuit-breaker is "interruptor automático de circuito".
Remarks: Not correct. IEC 441-14-20 says: interruptor automático (mecánico)
Example 2
Question: Give me the Arabic equivalent for "circuit-breaker" according to IEC 441-14-20.
Answer by ChatGPT: According to IEC 441-14-20, the Arabic equivalent for "circuit-breaker" is مفتاح القدرة الكهربائية
Remarks: Not correct. IEC 441-14-20 says: قاطع دارة
This means that although the IEC standard is named in the query, the result does not seem to stem from this specific standard.
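Because the answers carry no reliable source, they have to be compared with a reference entry. A minimal sketch of such a comparison in Python follows; TermEntry, TERMBASE and check_answer are hypothetical names, and the tiny in-memory termbase contains only the two equivalents quoted in the remarks above, not an actual export from a terminology database.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TermEntry:
    source_ref: str  # where the equivalent is documented
    language: str    # language code of the equivalent
    term: str        # the documented equivalent


# Hypothetical in-memory termbase holding only the equivalents quoted above.
TERMBASE = {
    ("circuit-breaker", "es"): TermEntry("IEC 441-14-20", "es", "interruptor automático (mecánico)"),
    ("circuit-breaker", "ar"): TermEntry("IEC 441-14-20", "ar", "قاطع دارة"),
}


def check_answer(english_term: str, language: str, answer: str) -> str:
    """Compare a machine-generated equivalent with the reference entry."""
    entry = TERMBASE.get((english_term, language))
    if entry is None:
        return "no reference entry: cannot verify"
    # Exact comparison, ignoring only case and surrounding whitespace.
    if answer.strip().casefold() == entry.term.casefold():
        return f"matches {entry.source_ref}"
    return f"differs from {entry.source_ref}: expected '{entry.term}'"


verdict = check_answer("circuit-breaker", "es", "interruptor automático de circuito")
```

The comparison is deliberately strict: an answer that is merely similar to the documented equivalent is reported as differing, which is what a terminologist would want to review by hand.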
Automatic term extraction
Time-consuming term research can be eased by automatic term extraction. ChatGPT handles this quite satisfactorily. Let us take a text from the following source and ask the system to perform a term extraction: https://product.tdk.com/en/techlibrary/developing/bio-sensor/xmr_sensor.html
The result is a list of term candidates which is quite appropriate.
At this stage, where we are already quite amazed, we begin to understand the similarity to coding. In other words: the machine does only what it is told to do. When asked for a term extraction including the sources of the terms, the result improves.
However, when the query asks not only for the source, but also for the respective definition of the terms (concepts), ChatGPT fails.
Is this failure a matter of query phrasing, or is there simply too much information to process? When “standards” is removed from the query, ChatGPT comes up with a list of term candidates with their definitions, but without context.
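For comparison, a conventional term extraction can be sketched in a few lines of Python. This is not what ChatGPT does internally, only a simple frequency-based approach under stated assumptions: term_candidates and STOP_WORDS are hypothetical names, the stop-word list is a small illustrative one, and word sequences are counted across sentence boundaries.

```python
import re
from collections import Counter

# Small illustrative stop-word list; a real extraction would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "as", "by", "for", "in", "is",
              "it", "of", "on", "or", "the", "to", "with"}


def term_candidates(text: str, max_words: int = 3, min_count: int = 2) -> list[tuple[str, int]]:
    """Return word sequences of up to max_words that occur at least min_count times."""
    # Lower-case the text and keep only words made of letters and hyphens.
    words = re.findall(r"[a-z][a-z\-]*", text.lower())
    counts = Counter()
    for size in range(1, max_words + 1):
        for start in range(len(words) - size + 1):
            gram = words[start:start + size]
            # Skip sequences that start or end with a stop word.
            if gram[0] in STOP_WORDS or gram[-1] in STOP_WORDS:
                continue
            counts[" ".join(gram)] += 1
    return [(term, n) for term, n in counts.most_common() if n >= min_count]


sample = "The magnetic sensor detects a magnetic field. A magnetic sensor is small."
candidates = dict(term_candidates(sample))
```

Such a list contains candidates only. Like the ChatGPT output, it carries neither source nor definition nor context, so every candidate still has to be validated by a terminologist.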
AI detecting context?
This leads us to the question of whether AI is able to detect context by itself:
Test 1
Question: What does character \s mean in Python?
Answer by ChatGPT: In Python, the character sequence "\s" represents a whitespace character, such as a space or a tab. It is often used in regular expressions to match any whitespace character.
Remarks: The context is “guessed”; the meaning differs depending on whether “\s” is read at Python level or at RegEx level.
Test 2
Question: What does character "s" stand for in Python?
Answer by ChatGPT: In Python, the character "s" is not a special character or reserved keyword. It is simply a lowercase letter that can be used as a variable name or as part of a string. However, "s" is often used as a convention to indicate that a variable or function is related to a string data type. For example, a variable named "name" might be changed to "names" if it contains multiple strings.
Remarks: Correct. All possibilities are listed.
Test 3
Question: What does RegEx character "s" stand for in Python?
Answer by ChatGPT: In Python's RegEx, the character "s" (without the quotes) stands for any whitespace character, including spaces, tabs, and line breaks.
Remarks: Correct, as the context is given in the question.
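The two levels involved in these tests can be checked directly in Python. The following short sketch uses only the standard re module; the sample strings and variable names are chosen for illustration. Raw strings (r"...") are used so that the backslash reaches the RegEx engine unchanged.

```python
import re

sample = "volt age\tkV\n"

# RegEx level: the pattern \s matches any whitespace character.
whitespace_hits = re.findall(r"\s", sample)

# RegEx level: a plain s matches only the letter itself.
letter_hits = re.findall(r"s", "stoss")

# Python string level: a raw string keeps the backslash, so the pattern
# r"\s" consists of two characters, a backslash and the letter s.
pattern_length = len(r"\s")
```

Here whitespace_hits contains the space, the tab and the line break, letter_hits contains the three occurrences of the letter s, and pattern_length is 2. The whitespace meaning therefore belongs to the backslash sequence at RegEx level, not to the letter "s" as such.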
To put it in a nutshell: the performance of ChatGPT is mind-blowing for language processing in the most common areas of application. Nevertheless, the more precise a result is expected to be, the more critically we have to examine the output.
As we live in the age of fake news, scamming, phishing and stolen identities, we would do well, on the one hand, to identify the source of the information and evaluate the accuracy of the query result.
On the other hand, a substantial danger lies in the short-sighted and careless use of AI: sensitive data entered into it will end up in all sorts of corpora on the web that are out of our control.
It is evident that media competence will gain further importance, which is a challenge for all users; it will also lead to a societal divide between those who have access to AI sources and those who do not.
Will the vision of the trans- and post-humanists of Silicon Valley come true, that mankind will eventually have outsourced its brains, skills and capacities? We are not there yet. Emotions will hardly become part of a machine, which will never be a biological organism. We absolutely must focus on our natural intelligence, which offers critical thinking, context recognition, cognitive abilities and emotions. Technological progress cannot be reversed, but the decision whether and to what degree we want to depend on devices and artificial intelligence is still ours!
About the author
Irene B. Fischer
Language technologist, translation and terminology expert, requirements engineer and data manager with long-standing experience in electrical and mechanical engineering and expertise in machine software localization. As part of her CAS "Translation Technology and AI", she developed Python scripts that automatically check software texts for technical and linguistic errors and correct them.
This article was not generated by ChatGPT or any other AI-based tool. ChatGPT was used solely for the examples.