Catalog
concept#AI#Machine Learning#Deep Learning#Natural Language Processing

Large Language Model (LLM)

A large language model is an AI model specialized in processing and generating natural language.

Large language models use deep learning to learn from extensive text data and generate human-like text.
Established
Medium

Classification

  • Medium
  • Technical
  • Design
  • Advanced

Technical context

  • API for external applications
  • Databases for training data
  • User interfaces for interactions
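
These integration points can be illustrated with a small service sketch. The following is a minimal example, assuming the FastAPI framework and a hypothetical generate() helper that wraps the underlying model; the endpoint name and request schema are illustrative only.

```python
# Minimal sketch of an API for external applications, assuming FastAPI and a
# hypothetical generate() helper that wraps the underlying language model.
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Prompt(BaseModel):
    text: str

def generate(prompt: str) -> str:
    """Hypothetical wrapper around the underlying language model."""
    return f"(model output for: {prompt})"

@app.post("/generate")
def generate_endpoint(prompt: Prompt) -> dict:
    # External applications send a prompt and receive the generated completion.
    return {"completion": generate(prompt.text)}
```

External applications would then call the /generate endpoint over HTTP, served for example with uvicorn.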

Principles & goals

  • Understandability
  • Flexibility
  • Scalability
Build
Enterprise, Domain

Use cases & scenarios

Compromises

  • Abuse for generating misinformation
  • Dependency on technology
  • Lack of transparency in decision-making processes
  • Regular model review
  • Use of transfer learning
  • Documentation of results

I/O & resources

  • Training data
  • Model architecture
  • Hyperparameters
  • Predictions
  • Generated texts
  • Analyses

Description

Large language models use deep learning to learn from extensive text data and generate human-like text. They can understand context and generate relevant responses, which makes them useful in a wide range of applications.

  • Increased efficiency in text generation
  • Improved user interaction
  • Diverse application possibilities

  • Can inherit biases from training data
  • Requires large amounts of data for effective training
  • Can provide inaccurate answers in certain contexts
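
As noted in the description, these models generate human-like continuations of a prompt. Below is a minimal sketch of such a generation call, assuming the Hugging Face transformers library and the small pretrained GPT-2 model; the prompt and parameters are illustrative.

```python
# Minimal sketch of text generation with a pretrained model, assuming the
# Hugging Face "transformers" library; model choice and prompt are illustrative.
from transformers import pipeline

generator = pipeline("text-generation", model="gpt2")
result = generator(
    "Large language models are useful because",
    max_new_tokens=40,
    num_return_sequences=1,
)
print(result[0]["generated_text"])
```

The pipeline downloads the model on first use and returns the prompt together with the generated continuation.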

  • Accuracy

    The percentage of correct predictions made by the model.

  • Processing Time

    The time taken by the model to generate a response.

  • User Satisfaction

    The level of satisfaction of users with the generated results.
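
The first two metrics can be computed directly once predictions are available. Below is a minimal sketch, assuming a hypothetical predict() wrapper around the model and a tiny labelled evaluation set; user satisfaction, by contrast, is typically gathered through surveys or feedback ratings.

```python
# Minimal sketch of the Accuracy and Processing Time metrics, assuming a
# hypothetical predict() wrapper and a tiny labelled evaluation set.
import time

def predict(prompt: str) -> str:
    """Hypothetical call to the deployed language model."""
    return "positive"

eval_set = [
    ("The answer was helpful.", "positive"),
    ("The response was wrong.", "negative"),
]

start = time.perf_counter()
predictions = [predict(text) for text, _ in eval_set]
elapsed = time.perf_counter() - start

correct = sum(pred == label for pred, (_, label) in zip(predictions, eval_set))
accuracy = correct / len(eval_set)  # share of correct predictions
print(f"Accuracy: {accuracy:.0%}")
print(f"Processing time: {elapsed / len(eval_set):.3f} s per response")
```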

GPT-3 by OpenAI

A powerful language model capable of generating human-like text and used in various applications.

BERT by Google

A model designed for natural language processing tasks, including text classification and question answering.

T5 by Google

A model that can transform text into various formats, including translation and summarization.

1. Collect and preprocess data
2. Select model architecture
3. Train and evaluate the model
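
The three steps above can be sketched in code. The following is a minimal, illustrative sketch assuming the Hugging Face transformers and datasets libraries, a tiny toy corpus, and the small pretrained GPT-2 architecture; a production model would require far more data and compute.

```python
# Minimal sketch of the three steps, assuming the Hugging Face "transformers"
# and "datasets" libraries and a tiny illustrative corpus.
from datasets import Dataset
from transformers import (AutoModelForCausalLM, AutoTokenizer,
                          DataCollatorForLanguageModeling, Trainer,
                          TrainingArguments)

# 1. Collect and preprocess data (toy corpus; a real model needs far more text).
corpus = ["Large language models learn from text.",
          "They generate human-like responses."]
tokenizer = AutoTokenizer.from_pretrained("gpt2")
tokenizer.pad_token = tokenizer.eos_token  # GPT-2 has no pad token by default
dataset = Dataset.from_dict({"text": corpus})
dataset = dataset.map(lambda ex: tokenizer(ex["text"], truncation=True),
                      remove_columns=["text"])

# 2. Select a model architecture (here: a small pretrained GPT-2 to fine-tune).
model = AutoModelForCausalLM.from_pretrained("gpt2")

# 3. Train and evaluate the model.
collator = DataCollatorForLanguageModeling(tokenizer=tokenizer, mlm=False)
args = TrainingArguments(output_dir="llm-demo", num_train_epochs=1,
                         per_device_train_batch_size=2, report_to="none")
trainer = Trainer(model=model, args=args, train_dataset=dataset,
                  eval_dataset=dataset, data_collator=collator)
trainer.train()
print(trainer.evaluate())  # reports e.g. eval_loss on the evaluation set
```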

⚠️ Technical debt & bottlenecks

  • Insufficient documentation
  • Outdated training data
  • Lack of model maintenance

  • Data quality
  • Computational resources
  • Model complexity
  • Using the model to create fake news
  • Abuse of user data without consent
  • Insufficient review of generated content
  • Assuming the model is always correct
  • Neglecting ethical implications
  • Over-reliance on automated systems

  • Knowledge in machine learning
  • Programming (e.g., Python)
  • Data analysis

  • Technological advances in machine learning
  • Increasing availability of data
  • Growing demand for AI applications

  • Compliance with data protection regulations
  • Technological infrastructure
  • Availability of skilled personnel