Loose-Info.com
Last Update 2026/02/07
Comparison of measured local LLM performance: Gemma 3 (QAT)

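This is a record of running local LLMs on a PC that leans toward the low-spec end.
The runs were made under some load, with virtual machines and other non-LLM workloads running at the same time.
The page is not intended to evaluate LLM performance with benchmarks or similar methods.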

Test PC

OS     : Debian GNU/Linux 12 (bookworm)
CPU    : Intel(R) Core(TM) i5-14400F
GPU    : GeForce RTX 3060 12GB
Memory : DDR4 PC4-25600 32GB × 4
SSD    : Crucial P310 CT1000P310SSD8-JP

Environment: Docker + Ollama (no special configuration)

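Test prompt

おすすめの日本の絶景を教えてください。東西南北、10箇所程度。
(English: "Please recommend scenic spots in Japan. About 10 places, across east, west, south, and north.")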
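The page does not show the command used to send the prompt. The metric names in the records below (total_duration, load_duration, eval_count and so on) match fields returned by Ollama's /api/generate endpoint, so a run could plausibly be reproduced along the following lines. This is only a sketch: the host and port (Ollama's default, 11434), the non-streaming mode, and the helper names build_request, extract_metrics and run_once are assumptions, not the author's setup.

```python
import json
import urllib.request

# Assumption: Ollama is listening on its default port on the local machine.
OLLAMA_URL = "http://localhost:11434/api/generate"

# The test prompt used on this page.
PROMPT = "おすすめの日本の絶景を教えてください。東西南北、10箇所程度。"

# Metric fields that appear in the per-run records of this page.
METRIC_KEYS = (
    "total_duration", "load_duration",
    "prompt_eval_count", "prompt_eval_duration",
    "eval_count", "eval_duration",
)


def build_request(model: str, prompt: str = PROMPT) -> urllib.request.Request:
    """Build a non-streaming generate request (hypothetical helper)."""
    body = json.dumps({"model": model, "prompt": prompt, "stream": False})
    return urllib.request.Request(
        OLLAMA_URL,
        data=body.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_metrics(response: dict) -> dict:
    """Keep only the timing and count fields from a generate response."""
    return {key: response[key] for key in METRIC_KEYS if key in response}


def run_once(model: str) -> dict:
    """Send the prompt and return the metrics; needs a running Ollama server."""
    with urllib.request.urlopen(build_request(model)) as resp:
        return extract_metrics(json.load(resp))
```

With a server running, `run_once("gemma3:1b-it-qat")` would return a dictionary holding the six fields listed in METRIC_KEYS.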

Gemma 3 (QAT)

No GPU, model not preloaded
1b-it-qat (36.9 TPS)   4b-it-qat (12.0 TPS)   12b-it-qat (4.82 TPS)   27b-it-qat (2.22 TPS)
GPU, model not preloaded
1b-it-qat (193 TPS)   4b-it-qat (84.7 TPS)   12b-it-qat (36.1 TPS)   27b-it-qat (5.06 TPS)
GPU, model preloaded
1b-it-qat (191 TPS)   4b-it-qat (84.0 TPS)   12b-it-qat (35.8 TPS)   27b-it-qat (5.07 TPS)
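To read the summary as ratios, the GPU figures can be divided by the CPU-only figures. The sketch below does this for the two "model not preloaded" rows, using only the TPS values listed above. The dictionary and function names are illustrative. Each value comes from a single run with a different output length, so the ratios are rough indications only.

```python
# TPS values copied from the summary above (model not preloaded).
cpu_tps = {"1b-it-qat": 36.9, "4b-it-qat": 12.0, "12b-it-qat": 4.82, "27b-it-qat": 2.22}
gpu_tps = {"1b-it-qat": 193, "4b-it-qat": 84.7, "12b-it-qat": 36.1, "27b-it-qat": 5.06}


def speedup(model: str) -> float:
    """Ratio of GPU tokens/s to CPU-only tokens/s for one model."""
    return gpu_tps[model] / cpu_tps[model]


for model in cpu_tps:
    print(f"{model}: {speedup(model):.1f}x")
# 1b-it-qat: 5.2x
# 4b-it-qat: 7.1x
# 12b-it-qat: 7.5x
# 27b-it-qat: 2.3x
```

In these runs the 27b model gains the least from the GPU, at roughly 2.3 times the CPU-only rate.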

TPS (tokens/s) is calculated as eval_count / eval_duration, with eval_duration converted from nanoseconds to seconds.
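As a worked example of this formula, the sketch below recomputes two of the summary values from the raw counts and durations recorded further down the page. The function name tokens_per_second is illustrative.

```python
def tokens_per_second(eval_count: int, eval_duration_ns: int) -> float:
    """Generated tokens divided by generation time, with the time given in nanoseconds."""
    return eval_count / (eval_duration_ns / 1e9)


# gemma3:1b-it-qat, no GPU: 808 tokens in 21869615838 ns
print(f"{tokens_per_second(808, 21_869_615_838):.1f}")   # 36.9

# gemma3:27b-it-qat, GPU, model not preloaded: 638 tokens in 126192161008 ns
print(f"{tokens_per_second(638, 126_192_161_008):.2f}")  # 5.06
```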

gemma3:1b-it-qat (no GPU, model not preloaded)

Model
  parameters          999.89M
  context length      32768
  embedding length    1152
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 23114286785 (23.114s)
load_duration (model load time)               : 745412115 (0.745s)
prompt_eval_count (prompt tokens evaluated)   : 27
prompt_eval_duration (prompt evaluation time) : 128995863 (0.129s)
eval_count (generated tokens)                 : 808
eval_duration (generation time)               : 21869615838 (21.870s)

real    0m23.125s
user    0m0.026s
sys     0m0.012s

Memory usage (RSS) : 1735204 KB

gemma3:4b-it-qat (no GPU, model not preloaded)

Model
  parameters          4.3B
  context length      131072
  embedding length    2560
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 73934311903 (73.934s)
load_duration (model load time)               : 2051269517 (2.051s)
prompt_eval_count (prompt tokens evaluated)   : 27
prompt_eval_duration (prompt evaluation time) : 419686236 (0.420s)
eval_count (generated tokens)                 : 851
eval_duration (generation time)               : 71080709852 (71.081s)

real    1m13.945s
user    0m0.022s
sys     0m0.016s

Memory usage (RSS) : 5672308 KB
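The duration fields are raw nanosecond counts, and the value in parentheses is the same figure in seconds. The sketch below converts the 4b-it-qat CPU-only run and checks how much of total_duration is covered by the three component durations. Treating the remainder as request overhead is an assumption; the page does not say what it consists of.

```python
NS_PER_S = 1_000_000_000

# Durations in nanoseconds, copied from the gemma3:4b-it-qat CPU-only record.
run = {
    "total_duration": 73_934_311_903,
    "load_duration": 2_051_269_517,
    "prompt_eval_duration": 419_686_236,
    "eval_duration": 71_080_709_852,
}


def to_seconds(ns: int) -> float:
    """Convert a nanosecond count to seconds."""
    return ns / NS_PER_S


def unaccounted_ns(record: dict) -> int:
    """Part of total_duration not covered by load, prompt evaluation and generation."""
    parts = ("load_duration", "prompt_eval_duration", "eval_duration")
    return record["total_duration"] - sum(record[key] for key in parts)


for name, ns in run.items():
    print(f"{name}: {to_seconds(ns):.3f}s")

print(f"unaccounted: {to_seconds(unaccounted_ns(run)):.3f}s")                      # 0.383s
print(f"generation share: {run['eval_duration'] / run['total_duration']:.1%}")    # 96.1%
```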

gemma3:12b-it-qat (no GPU, model not preloaded)

Model
  parameters          12.2B
  context length      131072
  embedding length    3840
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 134438344073 (134.438s)
load_duration (model load time)               : 3365638166 (3.366s)
prompt_eval_count (prompt tokens evaluated)   : 43
prompt_eval_duration (prompt evaluation time) : 2025412282 (2.025s)
eval_count (generated tokens)                 : 621
eval_duration (generation time)               : 128753220603 (128.753s)

real    2m14.450s
user    0m0.033s
sys     0m0.014s

Memory usage (RSS) : 11710796 KB

gemma3:27b-it-qat (no GPU, model not preloaded)

Model
  parameters          27.4B
  context length      131072
  embedding length    5376
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 731943453087 (731.943s)
load_duration (model load time)               : 3128573447 (3.129s)
prompt_eval_count (prompt tokens evaluated)   : 43
prompt_eval_duration (prompt evaluation time) : 4754642855 (4.755s)
eval_count (generated tokens)                 : 1607
eval_duration (generation time)               : 723332071807 (723.332s)

real    12m11.955s
user    0m0.095s
sys     0m0.030s

Memory usage (RSS) : 21615816 KB

gemma3:1b-it-qat (GPU, model not preloaded)

Model
  parameters          999.89M
  context length      32768
  embedding length    1152
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 6168774421 (6.169s)
load_duration (model load time)               : 752172466 (0.752s)
prompt_eval_count (prompt tokens evaluated)   : 27
prompt_eval_duration (prompt evaluation time) : 11596649 (0.012s)
eval_count (generated tokens)                 : 959
eval_duration (generation time)               : 4978427788 (4.978s)

real    0m6.180s
user    0m0.031s
sys     0m0.000s

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:01:00.0  On |                  N/A |
|  0%   46C    P2             139W / 170W |   1579MiB / 12288MiB |     83%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1259      G   /usr/lib/xorg/Xorg                          107MiB |
|    0   N/A  N/A      1934      G   xfwm4                                         2MiB |
|    0   N/A  N/A      2458      G   /usr/bin/x-www-browser                      200MiB |
|    0   N/A  N/A     31063      C   /usr/bin/ollama                            1256MiB |
+---------------------------------------------------------------------------------------+

Memory usage (RSS) : 1073432 KB

gemma3:4b-it-qat (GPU, model not preloaded)

Model
  parameters          4.3B
  context length      131072
  embedding length    2560
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 12730883298 (12.731s)
load_duration (model load time)               : 1473597891 (1.474s)
prompt_eval_count (prompt tokens evaluated)   : 27
prompt_eval_duration (prompt evaluation time) : 22769313 (0.023s)
eval_count (generated tokens)                 : 917
eval_duration (generation time)               : 10828147333 (10.828s)

real    0m12.742s
user    0m0.020s
sys     0m0.010s

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:01:00.0  On |                  N/A |
|  0%   53C    P2             163W / 170W |   4831MiB / 12288MiB |     94%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1259      G   /usr/lib/xorg/Xorg                          107MiB |
|    0   N/A  N/A      1934      G   xfwm4                                         2MiB |
|    0   N/A  N/A      2458      G   /usr/bin/x-www-browser                      200MiB |
|    0   N/A  N/A     31136      C   /usr/bin/ollama                            4508MiB |
+---------------------------------------------------------------------------------------+

Memory usage (RSS) : 1978560 KB

gemma3:12b-it-qat (GPU, model not preloaded)

Model
  parameters          12.2B
  context length      131072
  embedding length    3840
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 25037799624 (25.038s)
load_duration (model load time)               : 2250079527 (2.250s)
prompt_eval_count (prompt tokens evaluated)   : 43
prompt_eval_duration (prompt evaluation time) : 58829442 (0.059s)
eval_count (generated tokens)                 : 808
eval_duration (generation time)               : 22361751819 (22.362s)

real    0m25.048s
user    0m0.031s
sys     0m0.000s

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:01:00.0  On |                  N/A |
| 31%   60C    P2             169W / 170W |   9983MiB / 12288MiB |     97%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1259      G   /usr/lib/xorg/Xorg                          107MiB |
|    0   N/A  N/A      1934      G   xfwm4                                         2MiB |
|    0   N/A  N/A      2458      G   /usr/bin/x-www-browser                      200MiB |
|    0   N/A  N/A     31235      C   /usr/bin/ollama                            9660MiB |
+---------------------------------------------------------------------------------------+

Memory usage (RSS) : 2711480 KB

gemma3:27b-it-qat (GPU, model not preloaded)

Model
  parameters          27.4B
  context length      131072
  embedding length    5376
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 129983417904 (129.983s)
load_duration (model load time)               : 3036630444 (3.037s)
prompt_eval_count (prompt tokens evaluated)   : 43
prompt_eval_duration (prompt evaluation time) : 475172873 (0.475s)
eval_count (generated tokens)                 : 638
eval_duration (generation time)               : 126192161008 (126.192s)

real    2m10.002s
user    0m0.060s
sys     0m0.009s

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:01:00.0  On |                  N/A |
|  0%   54C    P2              67W / 170W |  11736MiB / 12288MiB |     17%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1259      G   /usr/lib/xorg/Xorg                          107MiB |
|    0   N/A  N/A      1934      G   xfwm4                                         2MiB |
|    0   N/A  N/A      2458      G   /usr/bin/x-www-browser                      201MiB |
|    0   N/A  N/A     31328      C   /usr/bin/ollama                           11412MiB |
+---------------------------------------------------------------------------------------+

Memory usage (RSS) : 10938060 KB
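The Memory-Usage column of the nvidia-smi output can be expressed as a fraction of the card's VRAM, which makes the GPU runs easier to compare. The sketch below pulls the two MiB figures out of a line with a regular expression. The function name vram_usage is illustrative, and the sample strings contain only the memory field as recorded for the 27b and 12b runs (model not preloaded).

```python
import re


def vram_usage(smi_line: str) -> tuple[int, int, float]:
    """Return (used MiB, total MiB, used fraction) from an nvidia-smi memory field."""
    match = re.search(r"(\d+)MiB\s*/\s*(\d+)MiB", smi_line)
    if match is None:
        raise ValueError("no 'usedMiB / totalMiB' field found")
    used, total = int(match.group(1)), int(match.group(2))
    return used, total, used / total


# Memory-Usage fields as recorded for the GPU runs without preloading.
for label, field in (("27b-it-qat", "11736MiB / 12288MiB"),
                     ("12b-it-qat", "9983MiB / 12288MiB")):
    used, total, fraction = vram_usage(field)
    print(f"{label}: {used} / {total} MiB ({fraction:.1%})")
# 27b-it-qat: 11736 / 12288 MiB (95.5%)
# 12b-it-qat: 9983 / 12288 MiB (81.2%)
```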

gemma3:1b-it-qat (GPU, model preloaded)

Model
  parameters          999.89M
  context length      32768
  embedding length    1152
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 4930318282 (4.930s)
load_duration (model load time)               : 73286852 (0.073s)
prompt_eval_count (prompt tokens evaluated)   : 27
prompt_eval_duration (prompt evaluation time) : 13136396 (0.013s)
eval_count (generated tokens)                 : 855
eval_duration (generation time)               : 4466034708 (4.466s)

real    0m4.941s
user    0m0.021s
sys     0m0.009s

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:01:00.0  On |                  N/A |
|  0%   53C    P2             143W / 170W |   1580MiB / 12288MiB |     87%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1259      G   /usr/lib/xorg/Xorg                          107MiB |
|    0   N/A  N/A      1934      G   xfwm4                                         2MiB |
|    0   N/A  N/A      2458      G   /usr/bin/x-www-browser                      201MiB |
|    0   N/A  N/A     37878      C   /usr/bin/ollama                            1256MiB |
+---------------------------------------------------------------------------------------+

Memory usage (RSS) : 1073776 KB

gemma3:4b-it-qat (GPU, model preloaded)

Model
  parameters          4.3B
  context length      131072
  embedding length    2560
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 7888331877 (7.888s)
load_duration (model load time)               : 69904156 (0.070s)
prompt_eval_count (prompt tokens evaluated)   : 27
prompt_eval_duration (prompt evaluation time) : 30720280 (0.031s)
eval_count (generated tokens)                 : 631
eval_duration (generation time)               : 7507783869 (7.508s)

real    0m7.899s
user    0m0.031s
sys     0m0.001s

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:01:00.0  On |                  N/A |
|  0%   58C    P2             165W / 170W |   4832MiB / 12288MiB |     94%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1259      G   /usr/lib/xorg/Xorg                          107MiB |
|    0   N/A  N/A      1934      G   xfwm4                                         2MiB |
|    0   N/A  N/A      2458      G   /usr/bin/x-www-browser                      201MiB |
|    0   N/A  N/A     37968      C   /usr/bin/ollama                            4508MiB |
+---------------------------------------------------------------------------------------+

Memory usage (RSS) : 1969820 KB

gemma3:12b-it-qat (GPU, model preloaded)

Model
  parameters          12.2B
  context length      131072
  embedding length    3840
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 27324603146 (27.325s)
load_duration (model load time)               : 67569592 (0.068s)
prompt_eval_count (prompt tokens evaluated)   : 43
prompt_eval_duration (prompt evaluation time) : 60669586 (0.061s)
eval_count (generated tokens)                 : 959
eval_duration (generation time)               : 26800558403 (26.801s)

real    0m27.335s
user    0m0.025s
sys     0m0.006s

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:01:00.0  On |                  N/A |
| 36%   64C    P2             169W / 170W |   9984MiB / 12288MiB |     97%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1259      G   /usr/lib/xorg/Xorg                          107MiB |
|    0   N/A  N/A      1934      G   xfwm4                                         2MiB |
|    0   N/A  N/A      2458      G   /usr/bin/x-www-browser                      201MiB |
|    0   N/A  N/A     38053      C   /usr/bin/ollama                            9660MiB |
+---------------------------------------------------------------------------------------+

Memory usage (RSS) : 2711660 KB

gemma3:27b-it-qat (GPU, model preloaded)

Model
  parameters          27.4B
  context length      131072
  embedding length    5376
  quantization        Q4_0

2026-02-07
total_duration (total time)                   : 126546740980 (126.547s)
load_duration (model load time)               : 71230491 (0.071s)
prompt_eval_count (prompt tokens evaluated)   : 43
prompt_eval_duration (prompt evaluation time) : 476466407 (0.476s)
eval_count (generated tokens)                 : 637
eval_duration (generation time)               : 125730518973 (125.731s)

real    2m6.558s
user    0m0.032s
sys     0m0.012s

+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 535.261.03             Driver Version: 535.261.03   CUDA Version: 12.2     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|=========================================+======================+======================|
|   0  NVIDIA GeForce RTX 3060        On  | 00000000:01:00.0  On |                  N/A |
| 33%   44C    P2              66W / 170W |  11735MiB / 12288MiB |     18%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+

+---------------------------------------------------------------------------------------+
| Processes:                                                                            |
|  GPU   GI   CI        PID   Type   Process name                            GPU Memory |
|        ID   ID                                                             Usage      |
|=======================================================================================|
|    0   N/A  N/A      1259      G   /usr/lib/xorg/Xorg                          107MiB |
|    0   N/A  N/A      1934      G   xfwm4                                         2MiB |
|    0   N/A  N/A      2458      G   /usr/bin/x-www-browser                      200MiB |
|    0   N/A  N/A     38148      C   /usr/bin/ollama                           11412MiB |
+---------------------------------------------------------------------------------------+

Memory usage (RSS) : 10929988 KB