How to run massive large language models locally on consumer hardware using AirLLM's layer-wise inference, no expensive GPU required.
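
For a taste of what this looks like in practice, here is a minimal sketch of generating text through AirLLM, based on the `AutoModel` entry point its documentation describes; the specific model repo id and generation settings are illustrative assumptions, not requirements of the library. The key idea is that AirLLM shards the checkpoint into per-layer files and streams one transformer layer at a time through the forward pass, so the full model never has to fit in memory at once.

```python
# pip install airllm
from airllm import AutoModel

# Illustrative model choice (an assumption for this sketch);
# any Hugging Face causal LM repo id should work the same way.
MODEL_ID = "garage-bAInd/Platypus2-70B-instruct"

# On first use AirLLM splits the checkpoint into per-layer shards on disk,
# then runs inference layer by layer, so only a single transformer layer
# needs to be resident in memory at any moment.
model = AutoModel.from_pretrained(MODEL_ID)

input_text = ["What is the capital of the United States?"]

# Tokenize with the model's own tokenizer, which AirLLM exposes directly.
input_tokens = model.tokenizer(
    input_text,
    return_tensors="pt",
    truncation=True,
    max_length=128,
)

# Even a modest GPU suffices here, since layers are moved onto it one at a time.
generation_output = model.generate(
    input_tokens["input_ids"].cuda(),
    max_new_tokens=20,
    use_cache=True,
    return_dict_in_generate=True,
)

print(model.tokenizer.decode(generation_output.sequences[0]))
```

The trade-off is throughput: every generated token pays the cost of loading each layer from disk, so layer-wise inference favors feasibility on small hardware over raw speed.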