LLM Data Needs

This template guides you through identifying data needs for your Large Language Model (LLM) and identifying how AI might guide you to achieve your goals.

Materials:

  1. This template
  2. Markers or pens
  3. Sticky notes

Time: 60 minutes

Steps:

Part 1: Open Ideation

  1. Brainstorm ideas about what you want your AI solution to achieve.
  2. Vote on the tool that you want to develop.

Part 2: Identifying Data Needs

  1. Analyse your data needs: distinguish between the data you have, the data you want and the data you need, and note where you can source each.

Public Data:

  1. Identify existing data: List all publicly available data sources you currently use (e.g., news articles, open-source code).
  2. Needed data: Specify additional publicly available data that would enhance your LLM (e.g., specific datasets, research papers, domain-specific knowledge bases).
  3. Unavailable data: Note any publicly available data that is relevant but inaccessible (e.g., paywalled data, restricted datasets).

Private Data:

  1. Identify existing data: List all private data you currently use (e.g., user logs, internal documents, customer feedback).
  2. Needed data: Specify additional private data that would improve your LLM (e.g., anonymised user interactions, clickstream data, sentiment analysis results).
  3. Unavailable data: Note any private data that is relevant but unavailable due to privacy concerns or technical limitations.

User Data:

  1. Identify existing data: List all user data you currently use (e.g., search queries, preferences, interactions with the AI tool).
  2. Needed data: Specify additional user data that would personalise the LLM's responses (e.g., user profiles, demographics, feedback on specific outputs).
  3. Unavailable data: Note any user data that is relevant but unavailable due to privacy regulations or user consent limitations.
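
If you want to carry the sticky notes from Part 2 into a digital record, the grid of three data categories (public, private, user) by three statuses (existing, needed, unavailable) can be sketched as a small data structure. This is only an illustrative sketch: the names DataSource, summarise, CATEGORIES and STATUSES are hypothetical and not part of this template, and the entries reuse the examples given above.

```python
from dataclasses import dataclass

# Hypothetical labels mirroring the three categories and three statuses in Part 2.
CATEGORIES = ("public", "private", "user")
STATUSES = ("existing", "needed", "unavailable")


@dataclass(frozen=True)
class DataSource:
    """One sticky note: a data source, its category and its status."""
    name: str
    category: str
    status: str
    note: str = ""  # e.g. why the data is unavailable, or where to source it

    def __post_init__(self):
        # Reject labels that do not appear in the template.
        if self.category not in CATEGORIES:
            raise ValueError(f"unknown category: {self.category}")
        if self.status not in STATUSES:
            raise ValueError(f"unknown status: {self.status}")


def summarise(sources):
    """Group source names into a category-by-status grid."""
    grid = {c: {s: [] for s in STATUSES} for c in CATEGORIES}
    for src in sources:
        grid[src.category][src.status].append(src.name)
    return grid


# Example entries taken from the examples listed in Part 2.
inventory = [
    DataSource("news articles", "public", "existing"),
    DataSource("research papers", "public", "needed"),
    DataSource("paywalled data", "public", "unavailable", "behind a paywall"),
    DataSource("customer feedback", "private", "existing"),
    DataSource("user profiles", "user", "needed", "requires user consent"),
]

grid = summarise(inventory)
for category in CATEGORIES:
    print(category, grid[category])
```

Empty cells in the grid are a quick way to spot categories the group has not yet discussed.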


Part 3: How Might AI

1. Use the proposed sentence structure to write solution statements, filling in the blanks for the following questions:

  • What can the AI achieve?
  • What data is critical to exploit?
  • What process needs to happen?