Run Gemma 3 with Docker Model Runner: A Fully Local GenAI Developer Experience

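The landscape of generative AI development is evolving rapidly, but it comes with significant challenges. API usage costs can add up quickly, especially during development. Privacy concerns arise when sensitive data has to be sent to external services. And relying on external APIs can introduce connectivity issues and latency.

Enter Gemma 3 and Docker Model Runner, a powerful combination that brings state-of-the-art language models to your local environment and addresses these challenges head-on.

In this blog post, we explain how to run Gemma 3 locally with Docker Model Runner. We also walk through a practical case study: a comment processing system that analyzes user feedback about a fictional AI assistant named Jarvis.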

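The power of local GenAI development

Before diving into the implementation, let's look at why local GenAI development is becoming increasingly important:

Cost efficiency: With no per-token or per-request charges, you can experiment freely without worrying about usage fees.
Data privacy: Sensitive data stays within your environment and is never exposed to third parties.
Reduced network latency: Removes the dependency on external APIs and enables offline use.
Full control: Run the model on your own terms, with no intermediaries and full transparency.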

Setting up Docker Model Runner with Gemma 3

Docker Model Runner provides an OpenAI-compatible API interface for running models locally.
It is included in Docker Desktop for macOS, version 4.40.0 and later.

Here is how to set it up with Gemma 3:

docker desktop enable model-runner --tcp 12434
docker model pull ai/gemma3

Once setup is complete, the OpenAI-compatible API provided by Model Runner is available at http://localhost:12434/engines/v1.
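
One way to confirm that the endpoint responds is a plain HTTP request. The following is a minimal sketch, assuming Node.js 18 or later (for the built-in fetch) and assuming chat completions are served under /chat/completions relative to the base URL, as is usual for OpenAI-compatible APIs. The helper names buildChatRequest and askModel are hypothetical and not part of Docker Model Runner.

```javascript
// Base URL from the setup step above.
const BASE_URL = "http://localhost:12434/engines/v1";

// Hypothetical helper: builds the URL and fetch options for a chat completion request.
function buildChatRequest(baseURL, model, prompt) {
  return {
    // Drop any trailing slash so the path is joined cleanly.
    url: `${baseURL.replace(/\/+$/, "")}/chat/completions`,
    options: {
      method: "POST",
      headers: { "Content-Type": "application/json" },
      body: JSON.stringify({
        model,
        messages: [{ role: "user", content: prompt }],
      }),
    },
  };
}

// Hypothetical helper: sends the request and returns the text of the first choice.
async function askModel(prompt, model = "ai/gemma3") {
  const { url, options } = buildChatRequest(BASE_URL, model, prompt);
  const res = await fetch(url, options);
  if (!res.ok) throw new Error(`Model Runner returned HTTP ${res.status}`);
  const data = await res.json();
  return data.choices[0].message.content;
}

// Usage (requires Model Runner to be running with ai/gemma3 pulled):
// askModel("Say hello in one sentence.").then(console.log);
```

Keeping the request construction separate from the network call makes it easy to inspect exactly what is sent before involving the model.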

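Case study: Comment processing system

To demonstrate the power of local GenAI development, we built a comment processing system that uses Gemma 3 for multiple NLP tasks. The system does the following:

Generates synthetic user comments about a fictional AI assistant
Classifies comments as positive, negative, or neutral
Clusters similar comments using embeddings
Identifies potential product features from the comments
Generates contextually appropriate responses

All tasks run locally, with no external API calls.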
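
The post does not show how the clustering step is implemented. As an illustration only, the sketch below groups comments by cosine similarity of their embedding vectors using a simple greedy threshold rule. The algorithm, the 0.8 threshold, and the function names are assumptions made for this example, not the project's actual method.

```javascript
// Cosine similarity between two equal-length numeric vectors.
function cosineSimilarity(a, b) {
  let dot = 0, normA = 0, normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) return 0;
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Greedy clustering: each item joins the first cluster whose seed (its first
// member) is at least `threshold` similar; otherwise it starts a new cluster.
function clusterByThreshold(items, threshold = 0.8) {
  const clusters = [];
  for (const item of items) {
    const match = clusters.find(
      (c) => cosineSimilarity(c[0].embedding, item.embedding) >= threshold
    );
    if (match) match.push(item);
    else clusters.push([item]);
  }
  return clusters;
}

// Toy 2-dimensional vectors stand in for real embeddings.
const clusters = clusterByThreshold([
  { text: "Jarvis is fast", embedding: [1, 0] },
  { text: "Very quick replies", embedding: [0.9, 0.1] },
  { text: "The UI is confusing", embedding: [0, 1] },
]);
console.log(clusters.map((c) => c.map((i) => i.text)));
```

With these toy vectors, the two speed-related comments land in one cluster and the UI comment in another. A greedy pass like this depends on input order, so a real system might prefer a more robust algorithm.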

Implementation details

Configuring the OpenAI SDK to use local models

To make this work, we configure the OpenAI SDK to point to Docker Model Runner:

// config.js
 
export default {
  // Model configuration
  openai: {
    baseURL: "http://localhost:12434/engines/v1", // Base URL for Docker Model Runner
    apiKey: 'ignored',
    model: "ai/gemma3",
    commentGeneration: { // Each task has its own settings; comment generation uses a higher temperature than classification for more varied output
      temperature: 0.3, 
      max_tokens: 250,
      n: 1,
    },
    embedding: {
      model: "ai/mxbai-embed-large", // Model for generating embeddings
    },
  },
  // ... other configuration options
};

import OpenAI from 'openai';
import config from './config.js';
 
// Initialize OpenAI client with local endpoint
const client = new OpenAI({
  baseURL: config.openai.baseURL,
  apiKey: config.openai.apiKey,
});
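
The configuration above names an embedding model but does not show how it is called. The sketch below is one possible way to request embeddings through the OpenAI SDK's embeddings.create method. The helper name embedTexts is hypothetical, and whether the local endpoint accepts an array of inputs in a single request is an assumption. The client is passed in as a parameter so the function can also be exercised with a stub.

```javascript
// Hypothetical helper: returns one embedding vector per input text.
// `client` is an OpenAI SDK client configured as shown above.
async function embedTexts(client, texts, model = "ai/mxbai-embed-large") {
  const response = await client.embeddings.create({ model, input: texts });
  // Sort by index so the output order matches the input order.
  return [...response.data]
    .sort((a, b) => a.index - b.index)
    .map((item) => item.embedding);
}

// Usage with the real client (inside an async function or ES module):
// const vectors = await embedTexts(client, ["Jarvis is fast", "Jarvis keeps freezing"]);
```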

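Task-specific configuration

One of the main benefits of running models locally is the freedom to try different configurations for each task without worrying about API costs or rate limits.

In our case:

Synthetic comment generation uses a higher temperature for creativity.
Classification uses a lower temperature and a 10-token limit for consistency.
Clustering allows up to 20 tokens to improve the semantic richness of the embeddings.

This flexibility allows for rapid iteration, performance tuning, and tailoring the model's behavior to each use case.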
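
The per-task settings described above could be kept in one table and merged into each request. This is a minimal sketch: the commentGeneration values and the 10- and 20-token limits come from this post, while the classification temperature of 0.1 and the helper names buildRequest and normalizeLabel are assumptions for illustration.

```javascript
// Per-task sampling settings. The classification temperature is an assumed value.
const taskSettings = {
  commentGeneration: { temperature: 0.3, max_tokens: 250, n: 1 },
  classification: { temperature: 0.1, max_tokens: 10 },
  clustering: { max_tokens: 20 },
};

// Hypothetical helper: merges a task's settings into a chat completion request.
function buildRequest(task, model, messages) {
  const settings = taskSettings[task];
  if (!settings) throw new Error(`Unknown task: ${task}`);
  return { model, messages, ...settings };
}

// Hypothetical helper: maps a short model reply onto one of the three labels.
function normalizeLabel(reply) {
  const text = String(reply).toLowerCase();
  for (const label of ["positive", "negative", "neutral"]) {
    if (text.includes(label)) return label;
  }
  return "neutral"; // Fallback when the reply names no known label.
}

const request = buildRequest("classification", "ai/gemma3", [
  { role: "user", content: "Classify this comment: Jarvis is great." },
]);
console.log(request.max_tokens, normalizeLabel(" Positive.\n"));
```

With only 10 tokens available, a classification reply may still carry punctuation or capitalization, which is why the label is normalized before use.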

Generating synthetic comments

To simulate user feedback, we rely on Gemma 3's ability to follow detailed, context-aware prompts.

/**
 * Create a prompt for comment generation
 * @param {string} type - Type of comment (positive, negative, neutral)
 * @param {string} topic - Topic of the comment
 * @returns {string} - Prompt for OpenAI
 */
function createPromptForCommentGeneration(type, topic) {
  let sentiment = '';
   
  switch (type) {
    case 'positive':
      sentiment = 'positive and appreciative';
      break;
    case 'negative':
      sentiment = 'negative and critical';
      break;
    case 'neutral':
      sentiment = 'neutral and balanced';
      break;
    default:
      sentiment = 'general';
  }
   
  return `Generate a realistic ${sentiment} user comment about an AI assistant called Jarvis, focusing on its ${topic}.
   
The comment should sound natural, as if written by a real user who has been using Jarvis.
Keep the comment concise (1-3 sentences) and focused on the specific topic.
Do not include ratings (like "5/5 stars") or formatting.
Just return the comment text without any additional context or explanation.`;
}

Examples:

"Honestly, Jarvis is just a lot of empty promises. It keeps suggesting irrelevant articles and failing to actually understand my requests for help with my work – it’s not helpful at all."
 
"Jarvis is seriously impressive – the speed at which it responds is incredible! I’ve never used an AI assistant that’s so quick and efficient, it’s a game changer."

The ability to generate realistic feedback on demand is extremely useful for simulating user data without API costs.
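
The call that sends the prompt from createPromptForCommentGeneration to the model is not shown in this post. The sketch below is one way it might look, assuming the prompt is sent as a single user message. The helper names generateComment and cleanComment are hypothetical. Because the example outputs above arrive wrapped in quotation marks, the sketch also strips one surrounding pair of quotes.

```javascript
// Hypothetical helper: strips surrounding whitespace and one pair of enclosing quotes.
function cleanComment(text) {
  const trimmed = String(text ?? "").trim();
  const match = trimmed.match(/^["“](.*)["”]$/s);
  // Only unwrap when the inner text has no further double quotes,
  // so a reply like "a" and "b" is left untouched.
  if (match && !/["“”]/.test(match[1])) return match[1].trim();
  return trimmed;
}

// Hypothetical helper: requests one comment for a prepared prompt.
// `client` is an OpenAI SDK client; `settings` holds options such as
// temperature and max_tokens.
async function generateComment(client, model, prompt, settings = {}) {
  const response = await client.chat.completions.create({
    model,
    messages: [{ role: "user", content: prompt }],
    ...settings,
  });
  return cleanComment(response.choices[0].message.content);
}

// Usage (inside an async function or ES module):
// const prompt = createPromptForCommentGeneration("positive", "response speed");
// const comment = await generateComment(client, config.openai.model, prompt,
//   config.openai.commentGeneration);
```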

Generating contextual responses

We also use Gemma 3 to simulate polite, on-brand support responses to user comments. Here is the prompt logic:

const response = await client.chat.completions.create({
    model: config.openai.model,
    messages: [
      {
        role: "system",
        content: `You are a customer support representative for an AI assistant called Jarvis. Your task is to generate polite, helpful responses to user comments.
 
Guidelines for responses:
1. Be empathetic and acknowledge the user's feedback
2. Thank the user for their input
3. If the comment is positive, express appreciation
4. If the comment is negative, apologize for the inconvenience and assure them you're working on improvements
5. If the comment is neutral, acknowledge their observation
6. If relevant, mention that their feedback will be considered for future updates
7. Keep responses concise (2-4 sentences) and professional
8. Do not make specific promises about feature implementation or timelines
9. Sign the response as "The Jarvis Team"`
      },
      {
        role: "user",
        content: `User comment: "${comment.text}"
Comment category: ${comment.category || 'unknown'}
 
${featuresContext}
 
Generate a polite, helpful response to this user comment.`
      }
    ],
    temperature: 0.7,
    max_tokens: 200
  });

Examples:

For a positive comment:

Thank you so much for your positive feedback regarding Jarvis’s interface! We’re thrilled to hear you find it clean and intuitive – that’s exactly what we’re aiming for. We appreciate you pointing out your desire for more visual customization options, and your feedback will definitely be considered as we continue to develop Jarvis.
 
The Jarvis Team

For a negative comment:

Thank you for your feedback – we appreciate you taking the time to share your observations about Jarvis. We sincerely apologize for the glitches and freezes you’ve experienced; we understand how frustrating that can be. Your input is valuable, and we’re actively working on improvements to enhance Jarvis’s reliability and accuracy. 
 
The Jarvis Team

This approach ensures a consistent, human-like support experience generated entirely locally.
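
The response request in this section interpolates a featuresContext value that the post does not define, and it does not show how the reply text is read back. The sketch below fills both gaps under stated assumptions: the wording of the context block and the helper names buildFeaturesContext and extractReply are hypothetical, and the feature objects are assumed to carry name and description fields.

```javascript
// Hypothetical helper: turns extracted features into a short context block
// for the user message. Returns an empty string when there are none.
function buildFeaturesContext(features = []) {
  if (features.length === 0) return "";
  const lines = features.map((f) => `- ${f.name}: ${f.description}`);
  return `Related product features under consideration:\n${lines.join("\n")}`;
}

// Hypothetical helper: pulls the reply text out of a chat completion response.
function extractReply(response) {
  const content = response?.choices?.[0]?.message?.content;
  return typeof content === "string" ? content.trim() : "";
}

const featuresContext = buildFeaturesContext([
  { name: "Enhanced Visual Customization", description: "More themes and display options." },
]);
console.log(featuresContext);
```

Returning an empty string when no features apply keeps the prompt template valid without a special case at the call site.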

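Extracting product features from user feedback

Beyond generating and responding to comments, we use Gemma 3 to analyze user feedback and identify actionable insights. This simulates the role of a product analyst, surfacing recurring themes, user pain points, and opportunities for improvement.

Here, we provide a prompt that instructs the model to identify up to three potential features or improvements based on a set of user comments.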

/**
 * Extract features from comments
 * @param {string} commentsText - Text of comments
 * @returns {Promise<Array>} - Array of identified features
 */
async function extractFeaturesFromComments(commentsText) {
  const response = await client.chat.completions.create({
    model: config.openai.model,
    messages: [
      {
        role: "system",
        content: `You are a product analyst for an AI assistant called Jarvis. Your task is to identify potential product features or improvements based on user comments.
         
For each set of comments, identify up to 3 potential features or improvements that could address the user feedback.
 
For each feature, provide:
1. A short name (2-5 words)
2. A brief description (1-2 sentences)
3. The type of feature (New Feature, Improvement, Bug Fix)
4. Priority (High, Medium, Low)
 
Format your response as a JSON array of features, with each feature having the fields: name, description, type, and priority.`
      },
      {
        role: "user",
        content: `Here are some user comments about Jarvis. Identify potential features or improvements based on these comments:
 
${commentsText}`
      }
    ],
    response_format: { type: "json_object" },
    temperature: 0.5
  });
   
  try {
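    const result = JSON.parse(response.choices[0].message.content);
    // The prompt asks for a JSON array, so accept both a bare array and a { features: [...] } object
    return Array.isArray(result) ? result : (result.features || []);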
  } catch (error) {
    console.error('Error parsing feature identification response:', error);
    return [];
  }
}

Here is an example of what the model returns:

"features": [
    {
      "name": "Enhanced Visual Customization",
      "description": "Allows users to personalize the Jarvis interface with more themes, icon styles, and display options to improve visual appeal and user preference.",
      "type": "Improvement",
      "priority": "Medium",
      "clusters": [
        "1"
      ]
    },

And, like everything else in this project, it is generated locally with no external services.
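
Local models do not always return exactly the JSON shape a prompt asks for, so the parsing step in extractFeaturesFromComments may benefit from being more tolerant. The sketch below is one hedged approach: it accepts either a bare array or an object with a features field, strips a surrounding Markdown code fence if the model added one, and drops entries that lack the four requested fields. The helper name parseFeatures and these validation rules are assumptions, not part of the original project.

```javascript
// Hypothetical helper: parses the model reply into an array of features.
function parseFeatures(raw) {
  const fence = "`".repeat(3); // Markdown code fence marker
  let text = String(raw ?? "").trim();
  // Remove a leading fence (with an optional "json" tag) and a trailing fence.
  if (text.startsWith(fence)) {
    text = text.slice(fence.length).replace(/^json\s*/i, "");
    if (text.endsWith(fence)) text = text.slice(0, -fence.length);
    text = text.trim();
  }
  let parsed;
  try {
    parsed = JSON.parse(text);
  } catch {
    return []; // Not valid JSON: treat as "no features found".
  }
  const list = Array.isArray(parsed) ? parsed : parsed?.features;
  if (!Array.isArray(list)) return [];
  // Keep only entries that carry the four fields requested in the prompt.
  const required = ["name", "description", "type", "priority"];
  return list.filter((f) => f && required.every((k) => typeof f[k] === "string"));
}

const sample = '{"features":[{"name":"Dark Mode","description":"Adds a dark theme.","type":"New Feature","priority":"Low"}]}';
console.log(parseFeatures(sample).map((f) => f.name));
```

Returning an empty array on any parse failure matches the fallback behavior of the original function, so callers do not need extra error handling.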

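Conclusion

By combining Gemma 3 with Docker Model Runner, we achieved a local GenAI workflow that is fast, private, cost-effective, and fully under our control. In building the comment processing system, we experienced the benefits of this approach firsthand:

Rapid iteration without worrying about API costs or rate limits
Flexibility to test different configurations for each task
Offline development with no dependency on external services
Significant cost savings during development

And this is just one example of what's possible. Whether you are prototyping a new AI product, building internal tools, or exploring advanced NLP use cases, running models locally puts you in the driver's seat.

As open source models and local tools continue to evolve, the barrier to entry for building powerful AI systems keeps getting lower.

Don't just consume AI. Develop, shape, and own the process.

Try it yourself: clone the repository and start experimenting today.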
