What formulas determine cost per token and yearly depreciation for LLM inference?
To estimate the amount of hardware and the number of software licenses required, along with the associated cost, follow these steps, which are illustrated with a hypothetical example. First, collect the cost information for both hardware and software. Next, calculate the total cost as follows. The number of servers is the number of instances times the GPUs per instance, divided by the number of GPUs per server. The yearly server cost is the initial server cost divided by the depreciation period (in years), plus the yearly software licensing and hosting costs per server. Total cost is