diff --git a/ai-llm-use-disclosure.tex b/ai-llm-use-disclosure.tex
new file mode 100644
index 0000000..be8d953
--- /dev/null
+++ b/ai-llm-use-disclosure.tex
@@ -0,0 +1,56 @@
+
+\chapter*{Use of Artificial Intelligence in This Thesis}
+
+This thesis was written between 2020 and 2025. During this time, Artificial Intelligence (AI) technology,
+including Large Language Models (LLMs), has entered widespread adoption. I have used such LLM systems in the preparation
+of this thesis. At the time this thesis was written, LLMs were a powerful and useful technology, but often produced
+wrong output. Thus, I used the following list of observations to guide my LLM use during the writing of this thesis.
+
+\begin{enumerate}
+    \item Passing text through an LLM is an imprecise operation. Especially when large amounts of text are passed
+        through an LLM, despite clear instructions such as ``only fix spelling errors'', the LLM output might deviate
+        from the source text. Therefore, the document text should never be passed through the LLM. Instead, the LLM
+        should be prompted to point out problems or to produce a list of suggestions for improvement.
+    \item LLMs are really bad at summarizing text that contains novel concepts. LLM summaries of text often converge to
+        a restatement of the general consensus on the text's main topic. Where the source text deviates from conventional
+        wisdom or makes novel points, an LLM summary will likely misrepresent those conclusions. Additionally, LLMs are
+        bad at capturing the point of a text. Unless extreme care is taken when prompting, it is easy to lead an LLM to
+        produce an inaccurate summary of a text that agrees with the prompt, but misses the gist of the text. Therefore,
+        extreme caution should be applied when using an LLM for summarization, and LLM output should be checked
+        diligently in such instances.
+    \item LLMs are bad at generating text from scratch. Especially on novel topics of academic interest that lack
+        well-known answers in the training corpus of these models, they generally do not produce useful text when
+        prompted. Therefore, LLMs should never be used to generate novel text.
+    \item LLMs are really bad at giving references. Prompts that ask for academic references on a topic are likely to
+        produce non-existent, ``hallucinated'' references. The existing references an LLM is most likely to dig up
+        usually occur on the first page of a web search on the topic, or are frequently cited in literature on the
+        topic. Thus, LLMs should never be directly queried for references. When researching a new concept, a better use
+        of an LLM is the generation of query strings for search engines like Google Scholar.
+\end{enumerate}
+
+Applying these observations, I never copied text from the LLM into this thesis. Where I edited the text of this thesis
+using suggestions from LLM output, I critically evaluated the LLM output and carefully considered each edit. Instances
+of LLM use in the writing of this thesis fall into the following categories.
+
+\paragraph{For checking spelling and grammar,} the LLM was prompted with an instruction to review the text and output a
+list of errors. The list was then reviewed and the errors were fixed in the source document by hand. An example prompt
+for the LLM in this case might be: ``The attached file contains the LaTeX source code of a chapter of a doctoral thesis
+titled `...'. Review the text and list any mistakes in spelling or grammar.''
+
+\paragraph{For improving formulation patterns,} the LLM was prompted with a short excerpt of text of at most two
+paragraphs and instructions asking for an improved version of the text. In response to such a prompt, the LLM will often
+change the meaning of parts of the text. Thus, I used the output as a reference example, and manually adjusted the
+source document, applying parts of the LLM response where fitting. An example prompt in this case might be: ``The
+following text consists of two paragraphs from a chapter on `...' in a PhD thesis on `...'. Improve the wording of these
+paragraphs to make them easier to read and understand.''
+
+\paragraph{For improving the structure of the text,} the LLM was prompted with an instruction to review the text and
+output a list of recommendations. The list was then reviewed, and changes were made to the source document by hand. An
+example prompt in this case might be: ``The attached document contains the LaTeX source code of a chapter of a PhD
+thesis on `...'. Critically assess the structure and organization of the chapter and write a list of suggestions for
+improvement.''
+
+In accordance with the recommendations of the University and State Library Darmstadt regarding the labelling and
+documentation of AI-generated materials dated September 22, 2025, instances where I used an LLM to edit parts of the text
+of this thesis as described above have not been explicitly labelled in the text. In this use, the LLM assumes a role
+similar to that of a human editor reviewing the text.
diff --git a/thesis.tex b/thesis.tex
index cf7b036..d9fd00c 100644
--- a/thesis.tex
+++ b/thesis.tex
@@ -34,6 +34,7 @@
 \listoffigures
 \listoftables
 
+\input{ai-llm-use-disclosure.tex}
 \dochapter{chapter-introduction} % Status: In pretty good shape
 \dochapter{chapter-epa} % Status: In pretty good shape
 \dochapter{chapter-ihsm} % Status: Copy-paste done, build works, integration TODO