In technology, response time is the time a system or functional unit takes to react to a given input.


Response time is the total time it takes to respond to a request for service. That service can be anything from a memory fetch, to a disk I/O, to a complex database query, to loading a full web page. Ignoring transmission time for a moment, the response time is the sum of the service time and the wait time. The service time is the time it takes to do the requested work. For a given request the service time varies little as the workload increases: doing X amount of work always takes about X amount of time. The wait time is how long the request has to wait in a queue before being serviced. It varies from zero, when no waiting is required, to a large multiple of the service time, when many requests are already in the queue and have to be serviced first.
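The relationship between service time, wait time, and response time can be illustrated with a minimal sketch of a single first-in, first-out server. The function name, the fixed service time, and the arrival times are assumptions chosen for illustration only; real systems rarely have a constant service time or a single queue.

```python
def fifo_response_times(arrivals, service_time):
    """Return a list of (wait, response) pairs for one FIFO server.

    arrivals: request arrival times, sorted in ascending order.
    service_time: time needed to service one request (assumed constant).
    """
    free_at = 0.0  # time at which the server next becomes idle
    results = []
    for arrival in arrivals:
        start = max(arrival, free_at)   # wait if the server is still busy
        wait = start - arrival
        free_at = start + service_time
        results.append((wait, wait + service_time))
    return results


# Three requests arrive close together, a fourth arrives after the queue drains.
for wait, response in fifo_response_times([0, 1, 2, 10], service_time=3):
    print(f"wait {wait}, response {response}")
```

In this sketch the service time is identical for every request; only the wait time differs, and it grows for requests that arrive while earlier ones are still queued.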

With basic queueing theory math[1] you can calculate how the average wait time increases as the device providing the service goes from 0% to 100% busy. As the device becomes busier, the average wait time increases non-linearly. The closer the device gets to 100% busy, the more steeply the response time rises; all of that increase comes from wait time, that is, from requests already in the queue that have to run first.
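As a hedged illustration of this non-linear growth, the sketch below uses the textbook M/M/1 model (a single server with random arrivals and exponentially distributed service times). The choice of model is an assumption made here; the text above does not say which queueing model is meant. Under M/M/1, the mean wait in the queue is S·ρ/(1 − ρ), where S is the mean service time and ρ is the utilization. The function name and the 10 ms service time are hypothetical.

```python
def mm1_times(service_time, utilization):
    """Return (mean wait, mean response) for an M/M/1 queue (assumed model).

    service_time: mean service time S.
    utilization: fraction of time the device is busy, 0 <= rho < 1.
    """
    if not 0 <= utilization < 1:
        raise ValueError("utilization must be in the range [0, 1)")
    wait = service_time * utilization / (1 - utilization)
    return wait, service_time + wait


# Hypothetical device with a 10 ms mean service time.
for busy in (0.0, 0.5, 0.8, 0.9, 0.95, 0.99):
    wait, response = mm1_times(10.0, busy)
    print(f"{busy:>4.0%} busy: wait {wait:7.1f} ms, response {response:7.1f} ms")
```

The formula is undefined at exactly 100% utilization, which matches the observation that the average wait grows without bound as the device approaches saturation.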

Transmission time is added to the response time when the request and the resulting response have to travel over a network, and it can be very significant.[2] Transmission time can include propagation delays due to distance (the speed of light is finite), delays due to transmission errors, and data communication bandwidth limits (especially at the last mile) that slow the transmission of the request or the reply.
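A rough estimate of two of these components, propagation delay and the delay imposed by limited bandwidth, can be sketched as follows. The function name and the example figures are hypothetical, the signal speed of about 2×10^8 m/s is an assumed approximation for optical fibre, and the sketch ignores transmission errors, routing, and queueing inside the network.

```python
SIGNAL_SPEED_M_PER_S = 2.0e8  # assumed: roughly two thirds of the speed of light


def one_way_time(distance_m, payload_bytes, bandwidth_bps,
                 signal_speed=SIGNAL_SPEED_M_PER_S):
    """Estimate one-way transmission time in seconds (simplified model)."""
    propagation = distance_m / signal_speed          # delay due to distance
    serialization = payload_bytes * 8 / bandwidth_bps  # delay due to bandwidth
    return propagation + serialization


# Hypothetical case: a 1 MB reply sent 6,000 km over a 10 Mbit/s last-mile link.
seconds = one_way_time(6_000_000, 1_000_000, 10_000_000)
print(f"estimated one-way time: {seconds * 1000:.0f} ms")
```

For small payloads over long distances the propagation term tends to dominate; for large payloads over slow links the bandwidth term does.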

Real-time systems

In real-time systems, the response time of a task or thread is defined as the time elapsed between its dispatch (the time when the task is ready to execute) and the time when it finishes its job (one dispatch). Response time is different from the worst-case execution time (WCET), which is the maximum time the task would take if it were to execute without interference. It is also different from the deadline, which is the length of time during which the task's output would be valid in the context of the specific system. It is related to the time to first byte (TTFB), which is the time between the dispatch and the time when the response starts.
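The distinction between these quantities can be made concrete with a small sketch. The `Job` record and its field names are hypothetical and not taken from any particular real-time operating system; the timestamps are assumed to come from a trace of one job.

```python
from dataclasses import dataclass


@dataclass
class Job:
    dispatch: float  # time the job became ready to execute
    finish: float    # time the job completed
    executed: float  # processor time the job actually consumed


def response_time(job):
    """Time elapsed from dispatch to completion."""
    return job.finish - job.dispatch


def interference(job):
    """Part of the response time spent not executing (preempted or blocked)."""
    return response_time(job) - job.executed


def meets_deadline(job, relative_deadline):
    """True if the job finished within its deadline, measured from dispatch."""
    return response_time(job) <= relative_deadline


job = Job(dispatch=10.0, finish=25.0, executed=8.0)
print(response_time(job), interference(job), meets_deadline(job, 20.0))
```

Here the execution time is bounded by the WCET, while the response time also includes whatever interference the job suffered; it is the response time, not the execution time, that is compared with the deadline.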

Display technologies

Response time is the amount of time a pixel in a display takes to change. It is measured in milliseconds (ms). Lower numbers mean faster transitions and therefore fewer visible image artifacts. Display monitors with long response times can create display motion blur around moving objects, making them unacceptable for rapidly moving images. Response times are usually measured for grey-to-grey transitions, but there is no industry standard.[3]

Interrupt latency

In computing, interrupt latency is the time that elapses from when an interrupt is generated to when the source of the interrupt is serviced. For many operating systems, devices are serviced as soon as the device's interrupt handler is executed. Interrupt latency may be affected by microprocessor design, interrupt controllers, interrupt masking, and the operating system's (OS) interrupt handling methods.

Latency (engineering)

Latency is a time interval between the stimulation and response, or, from a more general point of view, a time delay between the cause and the effect of some physical change in the system being observed. Latency is physically a consequence of the limited velocity with which any physical interaction can propagate. The magnitude of this velocity is always less than or equal to the speed of light. Therefore, every physical system will experience some sort of latency, regardless of the nature of stimulation that it has been exposed to.

The precise definition of latency depends on the system being observed and the nature of stimulation. In communications, the lower limit of latency is determined by the medium being used for communications. In reliable two-way communication systems, latency limits the maximum rate at which information can be transmitted, as there is often a limit on the amount of information that is "in-flight" at any one moment. In the field of human–machine interaction, perceptible latency has a strong effect on user satisfaction and usability.
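The effect of an in-flight limit on the transmission rate can be sketched with a simple window model: if at most a fixed amount of unacknowledged data may be outstanding, the sender can deliver no more than one window per round trip. The function names and example figures below are hypothetical, and the model ignores loss, congestion control, and protocol overhead.

```python
def max_throughput_bps(window_bytes, rtt_s):
    """Upper bound on throughput when at most window_bytes may be in flight."""
    if rtt_s <= 0:
        raise ValueError("round-trip time must be positive")
    return window_bytes * 8 / rtt_s


def bandwidth_delay_product_bytes(bandwidth_bps, rtt_s):
    """Amount of data that must be in flight to keep a link fully used."""
    return bandwidth_bps * rtt_s / 8


# Hypothetical link: 64 KiB window, 100 ms round-trip time.
print(f"{max_throughput_bps(65536, 0.1) / 1e6:.2f} Mbit/s at most")
# Data needed in flight to fill a 1 Gbit/s link with a 50 ms round trip.
print(f"{bandwidth_delay_product_bytes(1e9, 0.05) / 1e6:.2f} MB in flight")
```

Under this model, doubling the latency halves the achievable rate for a fixed window, regardless of how much raw bandwidth the link offers.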

