The Randomness Problem: How Lava Lamps Protect the Internet

The Randomness Problem: How Lava Lamps Protect the Internet
    Watch the video

    click to begin

    Youtube

    SciShow is supported by Brilliant.org.
    Go to Brilliant.org/SciShow to get 20% off of an annual Premium subscription.
    [♩INTRO]
    You jog onto the field for another NFL football game
    or at least your team's video game avatars do,
    which, I mean, let's face it, is about as close as you'll get.
    The virtual ref tosses his virtual coin to see who gets dibs on the ball,
    and…you lose!
    Once again, you're stuck letting the computer receive first.
    It happens so often it seems unfair.
    Come to think of it, what if it is?
    After all, the computer's following a fixed program.
    How could it ever give you a fair, random coin toss?
    The whole point of computers is that they're predictable!
    And if those coin tosses weren't random, how would you ever know?
    These questions matter for more than just games.
    Getting computers to behave randomly is crucial for simulations,
    scientific studies, and cybersecurity.
    But for all its importance, randomness is surprisingly hard to define,
    and even harder to produce.
    Computers just aren't that good at being random.
    In fact, most computerized randomness is basically faked;
    it's totally predictable.
    For real, honest-to-goodness randomness,
    engineers turn to some pretty odd tricks.
    It turns out a lot of the internet is kept secure by lava lamps,
    Geiger counters, and the occasional rooftop microphone!
    There are lots of uses for digital randomness.
    Think of a pollster who wants to call a representative sample of people,
    or medical researchers deciding who in their study gets which treatment.
    For the statistics to work, they have to choose the phone numbers
    or treatment groups randomly.
    Another example is predicting the behavior of very complex systems,
    like weather forecasts or traffic patterns.
    These systems may not have simple,
    neat equations that tell you how they'll develop.
    Instead, simulations often model the world
    as a collection of small randomized components.
    So a traffic simulator might assume that cars randomly show up
    at a highway entrance at an average rate of 10 per minute,
    then watch how the simulation proceeds.
    Randomness is also a critical part of basic network technology.
    If your laptop sends a packet of data over the wifi network at the same moment
    as your roommate's, the packets collide
    they garble each other, and each computer has to resend.
    But it wouldn't help anything if each computer, following the same deterministic
    program, tried to retransmit at exactly the same time.
    You'd just get collisions over and over!
    Instead, they wait a random amount of time after a collision.
    This type of symmetry breaking is impossible
    without some form of randomness.
    But the most common use for randomness is both high-stakes
    and very necessary: encryption, which requires secret data
    that an adversary can't predict.
    For example, every time you connect to a secure website
    like an online banking portal, your computer and the server
    running the site have to agree on encryption keys.
    The keys are like a more sophisticated version of a secret decoder ring:
    each computer can scramble its messages so that only someone
    with the same key can decode it.
    Obviously, no one should be able to guess the secret key, or they'll be able to
    see your bank account information or even steal your money.
    So your computer and the server base the keys on random numbers to make
    them nearly impossible for your online nemesis to guess.
    Randomness is so valuable that when the RAND Corporation published a
    600-page book with a million random digits in 1955,
    it was considered a landmark contribution to science.
    Now that computers have gotten faster,
    the RAND book is mostly useful as a source of hilarious Amazon reviews.
    But getting computers to generate numbers that are actually random
    is still much harder than you might think.
    The first problem to solve is how to even define randomness,
    because our intuitions about what's random aren't always great.
    Take this demonstration from paleontologist and author Stephen Jay Gould.
    In which image would you guess the dots were placed totally at random?
    If you guessed the one on the left, you're not alone.
    Most people see the clumps in the right image as patterns.
    But in fact, it's just the opposite!
    It's the even spacing on the left that's the result of a pattern: the program that
    generated the dots was modified to forbid them from being too close together.
    Here's another brain twist: tossing a coin twenty times is just as likely to
    produce ten heads and then ten tails as it is to mix them up.
    Neither sequence is inherently less random.
    To define randomness properly, the trick is to think not just about particular
    string of numbers, but about an infinite sequence of them.
    If a so-called random coin flipper kept producing clumps of heads and clumps
    of tails forever, then we could definitively say it's not random.
    Mathematicians have used the concept of infinite sequences to come up with a
    few different definitions for randomness, but the simplest one describes
    randomness as unpredictability.
    If you were betting on future digits in the sequence,
    you couldn't find an algorithm
    a mathematical procedure for a computer to follow
    whose predictions would make money in the long run.
    The other definitions all turn out to be equivalent to this:
    if a sequence is random, you can't describe it using some sort of algorithm.
    Maybe you can start to see the problem here: if by definition, you can't use a
    program to describe a random sequence,
    how are you supposed to write a program to generate one?
    Not only that, but if you try to build a random number generator,
    how can you tell if you've succeeded?
    You can't check a whole infinite sequence for randomness,
    or check all possible betting algorithms to see if any make money.
    When it comes to generating random numbers, most of the time,
    computers make do with pseudo-random number generators, or PRNGs.
    The numbers they generate aren't truly random … but they're close.
    The algorithms take a seed, some value to start with,
    generate a number from it, update the seed, and repeat.
    For instance, mathematician John von Neumann suggested
    the middle square method: you take the seed, square it,
    and grab the middle few digits as the next random number.
    That number also serves as the next seed.
    Unfortunately, the middle square method tends to start looping
    through the same cycle of numbers.
    But computer scientists have designed many more sophisticated algorithms
    that do better, some with awesome names like "Fortuna"
    and "the Mersenne Twister."
    To check if a pseudo-random number generator
    is a good approximation of randomness,
    you can use a combination of theoretical analysis and empirical testing.
    Theoretical analysis looks not at a particular sequence of numbers,
    but at the process that produced them.
    Even if the process wasn't truly random, mathematicians can sometimes prove
    statements like "This algorithm will generate at least 14 gazillion digits
    before the sequence wraps around and starts repeating."
    For empirical testing, you just generate some random numbers
    and see if the statistics look right,
    things like whether evens and odds are equally common.
    The stats will always be a bit off, so you can never say for sure that the
    sequence was randomly generated.
    But you can at least estimate how likely the stats you're seeing should be.
    Under most statistical tests, good pseudo-random numbers are
    indistinguishable from true randomness…
    if you don't know the initial seed.
    That's good enough for most purposes.
    I mean, who cares if your virtual football game isn't 100% unpredictable?
    It's not a big deal for things like medical research or predicting traffic patterns,
    either, as long as the numbers produce the same statistics as randomness.
    In fact, sometimes pseudo-randomness has real advantages.
    A randomized program is easier to troubleshoot if you can make it do all the
    same things again by providing the same seed.
    And in some uses of PRNGs, determinism is a core design feature.
    For example, a remote car key fob works by running a PRNG with a seed
    known only to it and your car.
    When you click the unlock button, the fob radios its next random number to the
    car, which checks that number against the next outputs of the PRNG.
    To a thief, the numbers look unpredictable.
    But the system only works because the key and the car produce the same
    series of numbers from their shared seed.
    Sometimes, though, PRNGs aren't enough.
    Even if the algorithm is flawless, an attacker who somehow figures out the seed
    can predict all the numbers.
    Ironically, the website Hacker News
    once became a poster child for this problem.
    Back in 2009, a security researcher realized that the site's random number
    generator, which assigned IDs to logged-in users, was seeded with the time
    when the server started up.
    By triggering a server restart, the attacker was able to narrow down the
    possible seeds to a small set of numbers.
    That let him predict other users' IDs and impersonate them.
    A similar problem a decade earlier allowed
    another team of researchers to cheat at online poker.
    A leaked seed could be especially catastrophic for a PRNG that was used for
    encryption: if an attacker uncovered the seed, they could guess the encryption
    keys, then go back and decrypt all the previous messages.
    So when the stakes are high,
    people get creative looking for sources of true randomness.
    The most common place to look is external noise
    factors that someone could in principle influence,
    but which the software can't predict.
    For example, the website Random.org has a rooftop microphone that captures
    atmospheric noise,
    unpredictable radio waves coming mostly from lightning and space.
    Most computers use more readily available sources of noise,
    like the timing between mouse clicks or incoming network messages.
    Some software will even ask you to scoot your mouse around
    while it's generating big encryption keys.
    For the highest-stakes uses, though, you want intrinsic randomness,
    something that's guaranteed unpredictable
    by the laws of physics no matter what the attacker knows or does.
    The Internet giant Cloudflare, which provides encryption
    for about 10% of all Internet traffic, points a video camera
    at a wall of lava lamps in the lobby of its San Francisco office.
    Meanwhile, in its London office,
    a camera watches three pendulums hanging from pendulums.
    Both are examples of chaotic systems:
    the tiniest inaccuracy in your knowledge of their configuration will send your
    future predictions wildly off-base.
    So those camera streams are effectively random.
    Intrinsic randomness can also come from smaller-scale physics.
    Some hardware random number generators record tiny fluctuations in the
    movements of electrons in the circuitry.
    And Geiger counters trained on decaying hunks of uranium will click at a rate
    determined by quantum mechanics, which is inherently probabilistic.
    So, by the strict definition, it's more likely than not
    that your football game isn't truly random.
    It's probably flipping a coin based on a simple PRNG
    seeded with the time you started playing.
    Even so, losing a bunch of coin flips is probably just a brief unlucky streak;
    you'd be hard-pressed to show any bias with larger statistical tests.
    For high-stakes contexts like security, though, rock-solid randomness
    is essential, and there's a whole cottage industry for supplying it.
    Lava lamps may not be the height of cool anymore, but as it turns out,
    they are still useful for security.
    Or maybe programmers just really miss the 70s.
    If you miss 1970s bedroom decor, or even if you don't,
    you can still strengthen your programming skills at Brilliant.org/SciShow
    with the Computer Science Fundamentals course.
    Even if you know nothing about computer science,
    the lessons and quizzes build on each other.
    So it's impossible to finish a lesson without improving.
    And if you are a confident computer programmer,
    the quizzes might still bend your brain a little bit, and you can always
    work on problems posed by other Brilliant.org premium members.
    Brilliant is offering the first 200 SciShow viewers to go to Brilliant.org/SciShow
    a 20% discount off of an annual premium subscription.
    So check it out!
    You'll definitely learn and have fun,
    and you'll be helping to support SciShow while you do!
    So, thanks!
    [♩OUTRO]
    6 'Undetectable' Poisons (and How to Detect Them) 5 Bizarre Aircraft That Pushed the Boundaries of Engineering Why You Don't Want Invisibility TeamViewer strikes back! Neil deGrasse Tyson's Best Arguments and Comebacks Of All Time Brett Kavanaugh's 1985 Bar Brawl Brings His Honesty Under Oath Into Question | The Daily Show Best of Insane Champion Picks That Actually Worked (League of Legends) The incredible inventions of intuitive AI | Maurice Conti 5 of the World's Most Dangerous Chemicals Joseph Stalin: The Red Terror

    Post a Comment