guqiong96

guqiong96/Lvllm

Python · Apache-2.0 · active
Health: 89

LvLLM is a specialized NUMA extension of vLLM that makes full use of CPU and memory resources, reduces GPU memory requirements, and combines GPU parallelism with NUMA parallelism to support hybrid inference for large MoE models.

Stars: 370
Forks: 33
Open Issues: 3
Contributors: 33
Last Push: 0d ago

Health Breakdown

Activity: 25
Community: 25
Maintenance: 14
Popularity: 25
#cpu #decode #gpu #hybrid #inference #model #moe #numa #parallelism #prefill #vllm

Should you contribute to guqiong96/Lvllm?

guqiong96/Lvllm has a FoundDev health score of 89/100, which puts it in the active-and-maintained tier. The maintainers have shipped recently, issues are being closed, and a PR you open this week has a realistic chance of being reviewed.

The last push was 0 days ago, which signals an actively maintained project. New issues are likely to get a maintainer response within days. The project is written primarily in Python, so prior Python experience will shorten ramp-up.

The project is licensed under Apache-2.0, a standard OSI-approved license that is generally safe to contribute to under normal employer IP policies.

More Python repos

dancinlab
dancinlab/hexa-lang
💎 Native compiler with atlas-bound theorems — 8 strict-lint stages · citation-enforced · no LLVM · no C-transpile
Stars: 224 · Health: 94
jobovy
jobovy/galpy
Galactic Dynamics in python
Stars: 277 · Health: 92
isaac-sim
isaac-sim/IsaacLab
Unified framework for robot learning built on NVIDIA Isaac Sim
Stars: 7.4k · Health: 92