
Using llmfit to estimate which large models your machine can run

Official website of the tool: https://www.llmfit.org/zh-cn

Method 1. Pull the Docker image

C:\d>wsl
root@DESKTOP-59T6U68:/mnt/c/d# docker pull ghcr.io/alexsjones/llmfit
Trying to pull ghcr.io/alexsjones/llmfit:latest...
Getting image source signatures
Copying blob ff86ea2e5edc skipped: already exists
Copying blob ae62bed2e6dd done
Copying blob 1ee47fd61fcb done
Copying blob f9cfedbd3651 done
Copying config 1b8032be6f done
Writing manifest to image destination
Storing signatures
1b8032be6f4f332fc871d3391dd2a112a60516ea2781bfde09822c522bf1d43d
root@DESKTOP-59T6U68:/mnt/c/d# docker run -itd -v /mnt/c/d:/par --network host --name llmfit ghcr.io/alexsjones/llmfit
00e39be8b38c2411788c066cdd212fbdfe5f39eb344f849f0ee930abee4ba6f6
root@DESKTOP-59T6U68:/mnt/c/d# docker exec -it llmfit
Error: must provide a non-empty command to start an exec session: invalid argument
root@DESKTOP-59T6U68:/mnt/c/d# docker exec -it llmfit bash
Error: can only create exec sessions on running containers: container state improper

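You cannot open a shell in this container: judging by the second error, it is no longer running, so it appears to run llmfit once and then exit. Its logs show the result in JSON format.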

root@DESKTOP-59T6U68:/mnt/c/d# docker logs llmfit
{
  "models": [
    {
      "best_quant": "Q4_K_M",
      "capabilities": [ "Tool Use" ],
      "capability_ids": [ "tool_use" ],
      "category": "Coding",
      "context_length": 262144,
      "disk_size_gb": 6.86,
      "effective_context_length": 8192,
      "estimated_tps": 32.8,
      "fit_level": "Marginal",
      "gguf_sources": [],
      "installed": false,
      "is_moe": true,
      "license": null,
      "memory_available_gb": 9.05,
      "memory_required_gb": 6.6,
      "moe_offloaded_gb": null,
      "name": "Intel/Qwen3-Coder-Next-int4-AutoRound",
      "notes": [
        "Context capped at 8192 tokens for estimation (model supports up to 262144; use --max-context to override)",
        "CPU-only: model loaded into system RAM",
        "MoE architecture, but expert offloading requires a GPU",
        "No GPU -- inference will be slow",
        "Baseline estimated speed: 32.8 tok/s"
      ],
      "parameter_count": "11.8B",
      "params_b": 11.82,
      "provider": "intel",
      "release_date": null,
      "run_mode": "CPU",
      "runtime": "llama.cpp",
      "runtime_label": "llama.cpp",
      "score": 88.9,
      "score_components": { "context": 100.0, "fit": 100.0, "quality": 85.0, "speed": 81.9 },
      "total_memory_gb": 6.6,
      "use_case": "Code generation and completion",
      "utilization_pct": 72.9
    },
    {
      "best_quant": "Q8_0",
      "capabilities": [],
      "capability_ids": [],
      "category": "General",
      "context_length": 8192,
      "disk_size_gb": 3.5,
      "effective_context_length": 8192,
      "estimated_tps": 39.8,
      "fit_level": "Marginal",
      "gguf_sources": [
        { "provider": "ggml-org", "repo": "ggml-org/DeepSeek-OCR-GGUF" }
      ],
      "installed": false,
      "is_moe": true,
      "license": null,
      "memory_available_gb": 9.05,
      "memory_required_gb": 1.9,
      "moe_offloaded_gb": null,
      "name": "deepseek-ai/DeepSeek-OCR",
      "notes": [
        "CPU-only: model loaded into system RAM",
        "MoE architecture, but expert offloading requires a GPU",
        "No GPU -- inference will be slow",
        "Best quantization for hardware: Q8_0 (model default: Q4_K_M)",
        "Baseline estimated speed: 39.8 tok/s"
      ],
      "parameter_count": "3.3B",
      "params_b": 3.34,
      "provider": "DeepSeek",
      "release_date": null,
      "run_mode": "CPU",
      "runtime": "llama.cpp",
      "runtime_label": "llama.cpp",
      "score": 79.7,
      "score_components": { "context": 100.0, "fit": 76.8, "quality": 63.0, "speed": 99.6 },
      "total_memory_gb": 1.9,
      "use_case": "General purpose",
      "utilization_pct": 21.0
    },
    {
      "best_quant": "Q8_0",
      "capabilities": [],
      "capability_ids": [],
      "category": "General",
      "context_length": 8192,
      "disk_size_gb": 3.56,
      "effective_context_length": 8192,
      "estimated_tps": 39.2,
      "fit_level": "Marginal",
      "gguf_sources": [],
      "installed": false,
      "is_moe": true,
      "license": null,
      "memory_available_gb": 9.05,
      "memory_required_gb": 1.9,
      "moe_offloaded_gb": null,
      "name": "deepseek-ai/DeepSeek-OCR-2",
      "notes": [
        "CPU-only: model loaded into system RAM",
        "MoE architecture, but expert offloading requires a GPU",
        "No GPU -- inference will be slow",
        "Best quantization for hardware: Q8_0 (model default: Q4_K_M)",
        "Baseline estimated speed: 39.2 tok/s"
      ],
      "parameter_count": "3.4B",
      "params_b": 3.39,
      "provider": "DeepSeek",
      "release_date": null,
      "run_mode": "CPU",
      "runtime": "llama.cpp",
      "runtime_label": "llama.cpp",
      "score": 79.3,
      "score_components": { "context": 100.0, "fit": 76.8, "quality": 63.0, "speed": 98.0 },
      "total_memory_gb": 1.9,
      "use_case": "General purpose",
      "utilization_pct": 21.0
    },
    {
      "best_quant": "Q8_0",
      "capabilities": [],
      "capability_ids": [],
      "category": "General",
      "context_length": 262144,
      "disk_size_gb": 3.51,
      "effective_context_length": 8192,
      "estimated_tps": 50.6,
      "fit_level": "Marginal",
      "gguf_sources": [],
      "installed": false,
      "is_moe": true,
      "license": null,
      "memory_available_gb": 9.05,
      "memory_required_gb": 1.9,
      "moe_offloaded_gb": null,
      "name": "dealignai/Gemma-4-26B-A4B-JANG_2L-CRACK",
      "notes": [
        "Context capped at 8192 tokens for estimation (model supports up to 262144; use --max-context to override)",
        "CPU-only: model loaded into system RAM",
        "MoE architecture, but expert offloading requires a GPU",
        "No GPU -- inference will be slow",
        "Best quantization for hardware: Q8_0 (model default: Q4_K_M)",
        "Baseline estimated speed: 50.6 tok/s"
      ],
      "parameter_count": "3.3B",
      "params_b": 3.34,
      "provider": "dealignai",
      "release_date": null,
      "run_mode": "CPU",
      "runtime": "llama.cpp",
      "runtime_label": "llama.cpp",
      "score": 79.0,
      "score_components": { "context": 100.0, "fit": 76.8, "quality": 61.0, "speed": 100.0 },
      "total_memory_gb": 1.9,
      "use_case": "General purpose",
      "utilization_pct": 21.0
    },
    {
      "best_quant": "Q8_0",
      "capabilities": [],
      "capability_ids": [],
      "category": "General",
      "context_length": 262144,
      "disk_size_gb": 4.96,
      "effective_context_length": 8192,
      "estimated_tps": 35.8,
      "fit_level": "Marginal",
      "gguf_sources": [],
      "installed": false,
      "is_moe": true,
      "license": null,
      "memory_available_gb": 9.05,
      "memory_required_gb": 2.6,
      "moe_offloaded_gb": null,
      "name": "dealignai/Gemma-4-26B-A4B-JANG_4M-CRACK",
      "notes": [
        "Context capped at 8192 tokens for estimation (model supports up to 262144; use --max-context to override)",
        "CPU-only: model loaded into system RAM",
        "MoE architecture, but expert offloading requires a GPU",
        "No GPU -- inference will be slow",
        "Best quantization for hardware: Q8_0 (model default: Q4_K_M)",
        "Baseline estimated speed: 35.8 tok/s"
      ],
      "parameter_count": "4.7B",
      "params_b": 4.72,
      "provider": "dealignai",
      "release_date": null,
      "run_mode": "CPU",
      "runtime": "llama.cpp",
      "runtime_label": "llama.cpp",
      "score": 76.7,
      "score_components": { "context": 100.0, "fit": 83.0, "quality": 61.0, "speed": 89.4 },
      "total_memory_gb": 2.6,
      "use_case": "General purpose",
      "utilization_pct": 28.7
    }
  ],
  "system": {
    "available_ram_gb": 9.05,
    "backend": "CPU (x86)",
    "cpu_cores": 16,
    "cpu_name": "AMD Ryzen 7 8845H w/ Radeon 780M Graphics",
    "gpu_count": 0,
    "gpu_name": null,
    "gpu_vram_gb": null,
    "gpus": [],
    "has_gpu": false,
    "total_ram_gb": 9.72,
    "unified_memory": false
  }
}
root@DESKTOP-59T6U68:/mnt/c/d#
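
Since the container only prints a JSON report, it can be convenient to post-process that report with a short script. The following is a minimal Python sketch, not part of llmfit itself: the function name summarize and the trimmed SAMPLE_REPORT are illustrative, with values copied from two entries in the log above. It also recomputes memory usage as memory_required_gb / memory_available_gb, which in this particular log happens to match the reported utilization_pct (for example 6.6 / 9.05 ≈ 72.9%); that relation is an observation from this output, not a documented formula.

```python
import json

# Trimmed sample in the same shape as the `docker logs llmfit` output;
# the values are copied from two of the entries in the log.
SAMPLE_REPORT = """
{
  "models": [
    {"name": "deepseek-ai/DeepSeek-OCR", "score": 79.7, "estimated_tps": 39.8,
     "memory_required_gb": 1.9, "memory_available_gb": 9.05, "utilization_pct": 21.0},
    {"name": "Intel/Qwen3-Coder-Next-int4-AutoRound", "score": 88.9, "estimated_tps": 32.8,
     "memory_required_gb": 6.6, "memory_available_gb": 9.05, "utilization_pct": 72.9}
  ],
  "system": {"available_ram_gb": 9.05, "has_gpu": false}
}
"""


def summarize(report_text):
    """Return (name, score, tok/s, memory %) tuples, highest score first."""
    report = json.loads(report_text)
    rows = []
    for model in report["models"]:
        # Recompute memory utilization from the two memory fields
        # (assumption: this is how utilization_pct relates to them).
        used_pct = 100 * model["memory_required_gb"] / model["memory_available_gb"]
        rows.append((model["name"], model["score"], model["estimated_tps"], round(used_pct, 1)))
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    for name, score, tps, used_pct in summarize(SAMPLE_REPORT):
        print(f"{score:5.1f}  {tps:5.1f} tok/s  {used_pct:5.1f}% RAM  {name}")
```

To use it on a real run, the log could be saved first (for example docker logs llmfit > report.json) and the file contents passed to summarize, assuming the log contains nothing but the JSON document.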

Method 2. Download the binary

C:\d>wget https://kkgithub.com/AlexsJones/llmfit/releases/download/v0.9.18/llmfit-v0.9.18-x86_64-pc-windows-msvc.zip
llmfit-v0.9.18-x86_64-pc-wind 100%[=================================================>] 2.93M 5.66MB/s in 0.5s
2026-05-03 10:51:31 (5.66 MB/s) - 'llmfit-v0.9.18-x86_64-pc-windows-msvc.zip' saved [3068451/3068451]

C:\d>wget https://kkgithub.com/AlexsJones/llmfit/releases/download/v0.9.18/llmfit-v0.9.18-x86_64-unknown-linux-gnu.tar.gz
llmfit-v0.9.18-x86_64-unknown 100%[=================================================>] 3.58M 6.69MB/s in 0.5s
2026-05-03 10:51:51 (6.69 MB/s) - 'llmfit-v0.9.18-x86_64-unknown-linux-gnu.tar.gz' saved [3758629/3758629]

C:\d>wget https://kkgithub.com/AlexsJones/llmfit/releases/download/v0.9.18/llmfit-v0.9.18-x86_64-unknown-linux-musl.tar.gz
llmfit-v0.9.18-x86_64-unknown 100%[=================================================>] 3.68M 7.03MB/s in 0.5s
2026-05-03 10:52:07 (7.03 MB/s) - 'llmfit-v0.9.18-x86_64-unknown-linux-musl.tar.gz' saved [3863166/3863166]
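
The three downloads above follow one naming pattern (llmfit-v<version>-<target>.<ext>), so the matching URL for the current platform can be built rather than typed by hand. The Python sketch below is only an illustration: asset_url and the ASSETS table are hypothetical names, the mirror host and target names are copied from the commands above, and it assumes that other releases keep the same pattern, which this article does not confirm.

```python
import platform

# Mirror host taken from the wget commands above.
BASE_URL = "https://kkgithub.com/AlexsJones/llmfit/releases/download"

# Target names and archive types seen in the three downloads above
# (x86_64 only; other platforms are not covered by this article).
ASSETS = {
    "Windows": ("x86_64-pc-windows-msvc", "zip"),
    "Linux": ("x86_64-unknown-linux-gnu", "tar.gz"),
}


def asset_url(version, system=None, musl=False):
    """Build the download URL for an x86_64 release archive."""
    system = system or platform.system()
    if system not in ASSETS:
        raise ValueError(f"no asset listed in this article for {system!r}")
    target, ext = ASSETS[system]
    if system == "Linux" and musl:
        # The musl build is the third archive downloaded above.
        target = "x86_64-unknown-linux-musl"
    return f"{BASE_URL}/v{version}/llmfit-v{version}-{target}.{ext}"


if __name__ == "__main__":
    print(asset_url("0.9.18", "Windows"))
    print(asset_url("0.9.18", "Linux", musl=True))
```

Whichever archive is downloaded has to be unpacked before llmfit can be run; the transcript above does not show that step.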

Running it without arguments opens a TUI (text user interface) that you can browse with the up and down arrow keys.

C:\d>llmfit

Running it with arguments lists the 10 best-fitting models.

C:\d>llmfit fit --perfect -n 10
=== System Specifications ===
CPU: AMD Ryzen 7 8845H w/ Radeon 780M Graphics (16 cores)
Total RAM: 12.80 GB
Available RAM: 5.46 GB
Backend: Vulkan
GPU: AMD Radeon 780M Graphics (3.00 GB VRAM, Vulkan)
(197 models hidden — incompatible backend)

=== Model Compatibility Analysis ===
Found 10 compatible model(s)
╭────────────┬────────────────────────────────────────────┬─────────────┬──────┬───────┬────────────┬───────┬───────────┬──────┬───────┬─────────┬─────────────╮
│ Status     │ Model                                      │ Provider    │ Size │ Score │ tok/s est. │ Quant │ Runtime   │ Mode │ Mem % │ Context │ Added to HF │
├────────────┼────────────────────────────────────────────┼─────────────┼──────┼───────┼────────────┼───────┼───────────┼──────┼───────┼─────────┼─────────────┤
│ 🟢 Perfect │ meta-llama/Llama-3.2-1B-Instruct           │ Meta        │ 1.2B │ 79    │ 106.8      │ Q8_0  │ llama.cpp │ GPU  │ 61.3% │ 4k      │ —           │
│ 🟢 Perfect │ cazzz307/Abliterated-Llama-3.2-1B-Instruct │ cazzz307    │ 1.2B │ 79    │ 106.8      │ Q8_0  │ llama.cpp │ GPU  │ 61.3% │ 4k      │ —           │
│ 🟢 Perfect │ Vikhrmodels/Vikhr-Llama-3.2-1B-Instruct    │ vikhrmodels │ 1.2B │ 79    │ 106.8      │ Q8_0  │ llama.cpp │ GPU  │ 62.6% │ 4194k   │ —           │
│ 🟢 Perfect │ RedHatAI/Llama-3.2-1B-Instruct-FP8         │ redhatai    │ 1.5B │ 79    │ 88.1       │ Q8_0  │ llama.cpp │ GPU  │ 72.4% │ 4194k   │ —           │
│ 🟢 Perfect │ RedHatAI/Llama-3.2-1B-Instruct-FP8-dynamic │ redhatai    │ 1.5B │ 79    │ 88.1       │ Q8_0  │ llama.cpp │ GPU  │ 72.4% │ 4194k   │ —           │
│ 🟢 Perfect │ Qwen/Qwen2.5-1.5B-Instruct                 │ Alibaba     │ 1.5B │ 79    │ 85.5       │ Q8_0  │ llama.cpp │ GPU  │ 74.1% │ 32k     │ —           │
│ 🟢 Perfect │ Qwen/Qwen2-1.5B-Instruct                   │ Alibaba     │ 1.5B │ 79    │ 85.5       │ Q8_0  │ llama.cpp │ GPU  │ 74.1% │ 32k     │ —           │
│ 🟢 Perfect │ Qwen/Qwen2.5-Math-1.5B-Instruct            │ Alibaba     │ 1.5B │ 79    │ 85.5       │ Q8_0  │ llama.cpp │ GPU  │ 72.4% │ 4k      │ —           │
│ 🟢 Perfect │ RedHatAI/Qwen2-1.5B-Instruct-FP8           │ redhatai    │ 1.5B │ 79    │ 85.5       │ Q8_0  │ llama.cpp │ GPU  │ 74.1% │ 32k     │ —           │
│ 🟢 Perfect │ LiquidAI/LFM2.5-1.2B-Instruct              │ Liquid AI   │ 1.2B │ 78    │ 112.8      │ Q8_0  │ llama.cpp │ GPU  │ 66.0% │ 128k    │ 2026-01-06  │
╰────────────┴────────────────────────────────────────────┴─────────────┴──────┴───────┴────────────┴───────┴───────────┴──────┴───────┴─────────┴─────────────╯
Note: tok/s values are baseline estimates; real runtime depends on engine/runtime.