At a recent BIDS seminar, Tristan Chambers, CLEAN Data Architect, presented "Unpacking police violence and misconduct records: Solving information extraction challenges using Large Language Models (LLMs)." Tristan introduced attendees to LLMs, described some of the problems they can solve, and showed examples of how he uses them in his work with large documents.
He stressed that maintaining traceability is a primary goal, and he has been developing his own methods for ensuring quality results in his work on 750,000 pages of police misconduct and use-of-force documentation. (The number is expected to grow to 1.5M pages within a few years.) He also touched on the most recent release of GPT-4 Turbo, which can process significantly more text in a single request than previous versions. He wrapped up by discussing security and privacy concerns. Some of these are now covered by updated privacy clauses; others remain open, such as “prompt injection,” which has no known solution, and still others we have not yet even imagined.