Speed Perception

Understanding and Measuring Perceived Performance

http://instartlogic.github.io/p/spdperception

Parvez Ahammad

Head of Data Science and Machine Learning
Instart Logic


Twitter: @perceptPA
Blog: www.parvez-ahammad.org

Estelle Weyl

Open Web Evangelist
Instart Logic

Twitter: @webdevtips, @estellevw, @standardista
Blog: www.standardista.com

  • HTML5 and CSS3 for the Real World
  • Animations and Transitions with CSS
  • Mobile HTML5
  • Web Performance Daybook
  • CSS: The Definitive Guide
  • Flexible Boxes in CSS

Speed Perception

  1. Understanding performance
  2. Understanding perceived performance
  3. Measuring performance & perceived performance

Performance

Which feels faster?

Performance

  • Time to Load
  • Time until usable
  • Jitter
  • Responsiveness
  • Smoothness

RAIL

Four phases of interaction: end-user’s perception

  1. Response to Input
  2. Animation & Scrolling
  3. Idle
  4. Page Load

Video: How Users Perceive the Speed of The Web (2015): Paul Irish / Google
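As a rough illustration (not part of the RAIL model itself), two of these phases map onto standard browser scheduling APIs; a minimal sketch, assuming a browser that implements requestAnimationFrame and requestIdleCallback:

    // Animation & Scrolling: do per-frame visual work inside requestAnimationFrame,
    // keeping each callback comfortably within a single frame budget.
    function onFrame(timestamp: DOMHighResTimeStamp): void {
      // ...update styles, positions, or canvas drawing here...
      requestAnimationFrame(onFrame);
    }
    requestAnimationFrame(onFrame);

    // Idle: defer non-urgent work (analytics, prefetching) until the browser
    // reports spare time between frames and input handling.
    requestIdleCallback((deadline: IdleDeadline) => {
      if (deadline.timeRemaining() > 0) {
        // ...process a small chunk of low-priority work...
      }
    });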

Web Performance

Download
Number of resources loaded: images, fonts, HTML, scripts, and CSS
Parse
File size of the above resources
Execute
Parsing & Painting
Perceived Performance
The user's perception of the speed of the load and reaction time.

What's involved in a page load?

latency

  1. HTTP request with protocol, host, port, and path.
  2. DNS lookup (find host IP)
  3. Socket opened between browser and server hosting the initial request content
  4. HTTP request sent
  5. Server handles request
  6. Server sends HTTP response
  7. The browser receives and parses the response
  8. Initial DOM tree is built
  9. Additional requests are made for the resources referenced in the initial response, including images, stylesheets, and scripts (see the sketch after this list).

    Go back to step 2 or 4 for each resource request.
  10. Stylesheets (blocking) are parsed. CSSOM built.
  11. Scripts (blocking) are parsed and executed. DOM updated.
  12. Content gets rendered
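A minimal sketch of watching the additional requests from step 9 arrive, using the standard Resource Timing API via PerformanceObserver (the log format is just for illustration):

    // Log each subresource (image, stylesheet, script, font, ...) as the
    // browser discovers and fetches it, with its rough download duration.
    const resourceObserver = new PerformanceObserver((list) => {
      for (const entry of list.getEntries() as PerformanceResourceTiming[]) {
        const durationMs = entry.responseEnd - entry.startTime;
        console.log(`${entry.initiatorType}: ${entry.name} (${durationMs.toFixed(0)} ms)`);
      }
    });
    // `buffered: true` also reports resources fetched before the observer was created.
    resourceObserver.observe({ type: "resource", buffered: true });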

What's involved in a page load?

  • HTTP request
  • DNS lookup
  • TCP Connect
  • HTTP request sent
  • Server Magic
  • Server Sends response
  • Browser receives/parses response
  • Resources fetched from Cache
  • Parse & Execute Scripts
  • Render Site
  • Each request: go back to DNS lookup or HTTP Request
  • Stylesheets are blocking
  • Scripts are blocking.

Navigation Timing API metrics

Navigation Timing API metrics
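A minimal sketch of reading these milestones in the browser, assuming Navigation Timing Level 2 (the standard PerformanceNavigationTiming entry); the derived names (dns, ttfb, and so on) are labels chosen for this example:

    // All timestamps are in milliseconds, relative to the start of navigation.
    const [nav] = performance.getEntriesByType("navigation") as PerformanceNavigationTiming[];

    if (nav) {
      const metrics = {
        dns: nav.domainLookupEnd - nav.domainLookupStart,
        tcpConnect: nav.connectEnd - nav.connectStart,
        ttfb: nav.responseStart - nav.startTime,          // time to first byte
        domContentLoaded: nav.domContentLoadedEventEnd - nav.startTime,
        // loadEventEnd is 0 until the load event has actually fired.
        load: nav.loadEventEnd - nav.startTime,
      };
      console.table(metrics);
    }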

Objective v. Subjective

Load Time v. Visually Complete

Load Times: 3,729ms v. 3,768ms

Visually Complete: 16s v. 8.7s

Staples
Wolferman's

WebPageTest (WPT) Metrics

Film Strip

  • Visually Complete
  • Last Change
  • Document Complete
  • Fully Loaded

Graphs

  • Visual Progress (SI: 4,462 v. 5,902)
  • Timings
  • Requests (286 v. 178)
  • Bytes (3.37 MB v. 2.19 MB)

timings, # of bytes, # of requests
above fold QoE measurements

Timings

TTFB
Time from the start of navigation until the first byte is received by the browser
Start Render
Time from the start of navigation until the first non-white content is painted to the browser (the green line in WebPageTest)
Load Time (onLoad)
From navigation start to Document Complete
Load Time (Fully Loaded)
Metrics collected until there has been 2 s of no network activity after Document Complete
Speed Index
A proportional representation of how quickly user-visible content was rendered
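Start Render and Speed Index are synthetic, filmstrip-based measurements; the closest in-browser analogue to Start Render is the Paint Timing API. A minimal sketch, assuming a browser that exposes "paint" performance entries:

    // Reports "first-paint" and "first-contentful-paint", in milliseconds
    // from the start of navigation.
    const paintObserver = new PerformanceObserver((list) => {
      for (const entry of list.getEntries()) {
        console.log(`${entry.name}: ${entry.startTime.toFixed(0)} ms`);
      }
    });
    paintObserver.observe({ type: "paint", buffered: true });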

Speed Index (& Perceptual Speed Index)

Speed Index

A metric of above-the-fold visual Quality of Experience (QoE)

  • Created by Patrick Meenan (Google)
  • Used on WebPageTest

Speed Index

Aggregate function on quickness of above-the-fold visual completion:

  • Visual progress graphs: Staples SI = 4,462 v. Wolferman's SI = 5,902

Speed Index = ∫ (1 - VC(t)/100) dt, integrated from t = 0 to the end of the load, where VC(t) is the percent of above-the-fold content that is visually complete at time t
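A minimal sketch of that integration, assuming visual completeness samples (time and percent complete) have already been extracted from a filmstrip; VCSample and speedIndex are illustrative names, not a WebPageTest API:

    // Speed Index: area of the "visually incomplete" region over time.
    interface VCSample {
      timeMs: number; // time of the filmstrip frame
      vc: number;     // percent visually complete at that time (0–100)
    }

    function speedIndex(samples: VCSample[], endTimeMs: number): number {
      let si = 0;
      for (let i = 0; i < samples.length; i++) {
        const intervalEnd = i + 1 < samples.length ? samples[i + 1].timeMs : endTimeMs;
        // Each interval contributes (1 - VC) × its duration.
        si += (1 - samples[i].vc / 100) * (intervalEnd - samples[i].timeMs);
      }
      return si; // milliseconds; lower is better
    }

For example, a page that stays at 0% for 1 s and then jumps to 100% scores 1,000, while one that reaches 90% at 0.5 s and 100% at 1 s scores roughly 550.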

Measurement of visual progress in Speed Index

  • Frame-by-frame visual completeness (VC) progress is computed from pixel-histogram comparisons:

    VC(t) = |Hist(frame_t) − Hist(frame_0)| / |Hist(frame_end) − Hist(frame_0)|

  • Pixel-wise similarity (mean histogram difference a.k.a. MHD) doesn’t capture visual perception!
    • Perception of Shape / Color / Object similarity

Pixel-wise similarity doesn’t capture shape similarity

Black/White = 50/50 in every case, yet MHD (Mean Histogram Difference) = 0

Six patterns, each 50% black and 50% white: their pixel histograms are identical, so the MHD between any two of them is 0, even though their shapes clearly differ.
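A minimal sketch of why this happens: a histogram only counts how many pixels of each value exist, not where they are, so two visibly different 50/50 patterns compare as identical (helper names are illustrative):

    // 0 = black pixel, 1 = white pixel; images are flattened row-major.
    function histogram(pixels: number[]): [number, number] {
      let black = 0;
      for (const p of pixels) if (p === 0) black++;
      return [black, pixels.length - black];
    }

    // Mean absolute difference between the two histograms.
    function meanHistogramDifference(a: number[], b: number[]): number {
      const [aBlack, aWhite] = histogram(a);
      const [bBlack, bWhite] = histogram(b);
      return (Math.abs(aBlack - bBlack) + Math.abs(aWhite - bWhite)) / 2;
    }

    // A left/right split and a checkerboard: different shapes, same histogram.
    const halfSplit    = [0, 0, 1, 1,  0, 0, 1, 1,  0, 0, 1, 1,  0, 0, 1, 1];
    const checkerboard = [0, 1, 0, 1,  1, 0, 1, 0,  0, 1, 0, 1,  1, 0, 1, 0];

    console.log(meanHistogramDifference(halfSplit, checkerboard)); // 0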

Pixel-wise similarity doesn’t capture color similarity

Speed Index

Aggregate function on quickness of above-the-fold visual completion:

  • Visual progress graphs: Staples SI = 4,462 v. Wolferman's SI = 5,902

Speed Index = ∫ (1 - VC(t)/100) dt, integrated from t = 0 to the end of the load, where VC(t) is the percent of above-the-fold content that is visually complete at time t

Proposal for a perceptually oriented visual QoE metric

  • Update: compute frame-by-frame VC progress using structural similarity (SSIM) instead of histogram differences

Perceptual Speed Index

Frame-by-frame VC progress computation using SSIM

Perceptual Speed Index = ∫ (1 - VC_SSIM(t)) dt, integrated from t = 0 to the end of the load, where VC_SSIM(t) is the frame-by-frame visual completeness derived from SSIM against the final frame
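A minimal sketch of the same integration with SSIM-based visual completeness, assuming an SSIM implementation is supplied by the caller; the Frame type and ssim parameter are illustrative, not the authors' actual code:

    // Perceptual Speed Index: identical integration to Speed Index, but the
    // per-frame visual completeness comes from structural similarity (SSIM)
    // against the final frame instead of a histogram difference.
    interface Frame {
      timeMs: number;
      pixels: Uint8ClampedArray; // grayscale pixel data for this filmstrip frame
      width: number;
      height: number;
    }

    function perceptualSpeedIndex(
      frames: Frame[],
      endTimeMs: number,
      ssim: (a: Frame, b: Frame) => number, // returns similarity in [0, 1]
    ): number {
      const finalFrame = frames[frames.length - 1];
      let psi = 0;
      for (let i = 0; i < frames.length; i++) {
        const intervalEnd = i + 1 < frames.length ? frames[i + 1].timeMs : endTimeMs;
        const vc = ssim(frames[i], finalFrame); // SSIM-based visual completeness
        psi += (1 - vc) * (intervalEnd - frames[i].timeMs);
      }
      return psi; // milliseconds; lower is better
    }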

Without Jitter

With Jitter

PSI v. SI

  • SI and PSI are linearly correlated
  • With visual jitter / layout thrashing, PSI > SI
    • PSI is higher when visual jitter exists (pop-up ads, large layout changes, etc.)
  • SSIM-based visual progress measurements match human perception more closely than MHD
  • Swapping MHD for SSIM doesn't affect websites without visual jitter

Staples

Wolferman's

PSI v. SI

Speed Index

  • Primarily focused on progress of above-fold loading
  • Does not account for layout stability

Perceptual Speed Index

  • A perceptually oriented metric to measure above-fold visual QoE
  • Designed to account for visual jitter (layout stability)
  • Complementary to SI

SpeedPerception

“SpeedPerception is a large-scale web performance crowdsourcing study focused on the perceived loading performance of above-the-fold content.”

Premise: perceived performance is relative.

Credits & Links: SpeedPerception

Phase 1 was crowdsourced from 07/28/2016 to 09/30/2016.

Study Hypotheses

Hypothesis 1: Visual metrics will perform better than non-visual/network metrics


Hypothesis 2: No single metric can explain human choices with 90%+ accuracy


Hypothesis 3: Users will not wait until “Visually Complete” to make their choice (despite the explicit instruction to wait until the video turns grey)

Study Metrics

  • 5,444 sessions, of which 51% were complete and valid
  • 77,482 votes, of which 75% were valid
  • Graph: each of the 160 pairs was tested between 230 and 330 times

Feedback

Perception of speed and UX is strongly impacted by pop-ups / overlays

Histogram of comments, highlighting that pop-ups were commonly mentioned

Hypothesis 1

Hypothesis 1: Visual metrics will perform better than non-visual/network metrics

Not True

Questions to Consider

  • Does the presence of visual jitter / interstitials interfere with metric performance?
    • Can the metrics be improved?
  • Will there be different trends for sites free of visual jitter sources such as modals and overlays?
  • Is it possible to automatically predict the presence of jitter to help choose a better set of metrics?

Hypothesis 1: Visual metrics will perform better than non-visual/network metrics

True

Hypothesis 2

Hypothesis 2: No single metric can explain human choices with 90%+ accuracy

True

Hypothesis 2: No single metric can explain human choices with 90%+ accuracy

Still True

Conclusions & Thoughts

  • There appears to be no single “unicorn” metric, but is there a combined synthetic metric (a joint ML model) that would do a better job?
  • People only looked at two videos before making their call. Is there additional information we can extract from the videos that would improve our models?

Hypothesis 3

Hypothesis 3: Users will not wait until “Visually Complete” to make their choice (despite the explicit instruction to wait until the video turns grey)

Speed Perception Phase 2

  • How do visual jitter & interstitials impact perceived performance?
    • Do they interfere with metric performance?
    • Can metrics be improved?
    • Are sites free of visual jitter sources such as modals and overlays perceived as more performant?
    • Is it possible to automatically predict the presence of jitter to help choose a better set of metrics?
  • Does a long DOMContentLoaded impact perceived performance?

User Experience > Developer Experience

Thank you

Speed Perception: Understanding and Measuring Perceived Performance

http://instartlogic.github.io/p/spdperception