
The “Chad” bug


The Hangouts Dialer on Android absolutely REFUSES to find 2 of my contacts. I have 132 of them in the group "My Contacts" and all of them have phone numbers. Wanting to troubleshoot, I tried searching for them one by one in Dialer: it can find 130 but not these 2.

I exported the contacts and looked at the raw Google CSV data. One of the 2 problematic contacts had a whitespace character at the end of its phone number. I removed it. Bingo, Dialer can now find it!

The other contact, "Chad", has no whitespace though. Why can Dialer not find it? All this contact has is a name, a phone number, and an email address. I tried everything: deleting it, recreating it from Android's contact editor, recreating it from the desktop at https://contacts.google.com, clearing the cache and data from Hangouts, etc. Nothing works, except one thing:

If I remove the email address from this contact... Dialer can find it!

I add the address back... Dialer cannot find it.

I replace "chad@" with "chad2@" in the address... Dialer can find it.

I add multiple addresses along with the real one... Dialer cannot find it.

It is as if Dialer is banning or blocking this specific user based on his email address. Did I block this user by accident (Hangouts options -> Blocked people)? Nope. Did I hide him by accident (Hangouts options -> Settings -> $my_account -> Hidden Contacts)? Nope.

My version of Hangouts is the latest (6.1.109448852). Ditto for Hangouts Dialer (0.1.100944346). My phone is running Android 5.0.1. I only have 1 Google account on this phone. Nothing fancy.

Please Hangouts team, fix this chad bug.



Make a Doom level, part 2: design


Part 1: the basics · Part 2: design · Part 3: cheating

I assume you’ve read the introduction, which tells you the basics of putting a world together.

This post is more narrative than mechanical; it’s a tour of my thought process as I try to turn my previous map into something a little more fun to play. I still point out new editor tricks as I use them, but honestly, you already know the bulk of how to use an editor. Poke through SLADE’s keybindings (Edit → Preferences → Input) to see what hidden gems it has, click that “Show All” checkbox in the prop panel, and go wild. But please do comment if I blatantly forgot to explain something new.

(Fair warning: NVidia’s recent Linux drivers seem to have a bug that spontaneously crashes programs using OpenGL. SLADE is one such program. So if any of the screenshots seem to be slightly inconsistent, it’s probably because the editor crashed and I had to redo some work and it didn’t come out exactly the same.)

I have to say upfront: I’m far from being an expert on design. I haven’t even released a Doom map, aside from the one attached to the previous post. I have no qualifications whatsoever. That means we can learn about it together!

I admit also that my initial design instincts are terrible. I want to make lots of flat rectangles, aligned to the grid. It turns out that’s not very interesting. I wish I still had a copy of the very first map I made, some fifteen years ago now: every room was rectangular, every hallway was 64×64, every encounter had monsters packed in so neatly that they couldn’t even move sometimes.

I guess tidiness, neatness, and regularity don’t make for a very interesting map. So what does? I went hunting for some answers.

Romero's rules

John Romero, who created episode 1 of the original Doom, had a literal set of rules that I’m just going to paste straight from the Doom wiki:

  • always changing floor height when I wanted to change floor textures
  • using special border textures between different wall segments and doorways
  • being strict about texture alignment
  • conscious use of contrast everywhere in a level between light and dark areas, cramped and open areas
  • making sure that if a player could see outside that they should be able to somehow get there
  • being strict about designing several secret areas on every level
  • making my levels flow so the player will revisit areas several times so they will better understand the 3D space of the level
  • creating easily recognizable landmarks in several places for easier navigation

That’s some good stuff, and you can see it in the episode 1 maps. (If you’re not familiar, here are some projections of them.) There are lots of loops, big unique central areas, windows that look into other places. Truly, this is the formula for a good map.

But, hang on. Romero only did episode 1 of the original Doom, and half a dozen maps in Doom II. Who is responsible for the others?

Sandy Petersen

Yeah, him.

Sandy Petersen is a Lovecraft-inspired madman who created the entire other two episodes of Doom in ten weeks. He also did more than half of Doom II, which is of particular interest to me, since that’s where my nostalgia lies. A quick perusal of the levels he designed reveals that all the ones that stick out to me are Sandy’s.

And yet, according to Masters of Doom,

His levels were not nearly as aesthetically pleasing as Romero’s; in fact, some of the id guys thought they were downright ugly, but they were undeniably fun and fiendish.

You might even argue that while Romero’s levels were elegant and flowed around each other, Sandy’s were bizarre mishmashes. Slough of Despair is just a big hand. Tricks and Traps is eight different rooms containing different weird little gimmicks. Downtown is a group of unconnected buildings with various ideas in them. So is Suburbs. And so is Industrial Zone — but that one was actually Romero! Surprise:

I loved it most when I’d try some weird experimental thing. Then John Carmack would berate me for stretching the engine too far. Then Romero, McGee, and Green would do a bunch of levels imitating it, because they liked it. Then John Carmack would change the engine. One good example was when I did a whole outdoors level, set in a city. Then everyone else had to make one.

I love this story. Sandy just went and did something bizarre, and it was unique enough that it inspired everyone else. Game culture tends to talk derisively about “gimmicks”, but I think a well-done gimmick is a fabulous thing.

Alas, I’m not sure this helps me — “do something bizarre” is not very concrete advice.

It's art, dummy

In an attempt to absorb some of whatever made Doom II stand out to me, I started a project of reimagining individual levels, but larger and with fancy ZDoom features. I was explaining this to my artist partner Mel, and went to show them the first level of Doom II.

The iconic opening view of Doom II

They immediately commented about how everything in this opening shot draws the player’s eye to where they’re supposed to go. The hallway is the brightest thing on the screen, and the gradient of light makes it seem to glow. The walls are a bright green in an otherwise grungy brown room. The corners of the triangular steps literally point you in that direction. The two pillars frame the whole scene.

If this were a painting rather than a 3D world, well, you could do worse.

This is an incredible revelation that I still haven’t fully wrapped my head around. Level design requires the same kind of composition that goes into any other kind of art. You want landmarks, so the viewer knows what’s important and what’s not. You want hidden details, to reward viewers who pay closer attention. You want things to connect together and be revisited, so the world seems to evolve as the viewer spends more time with it. You want variation, so the viewer can tell the “background” from the “foreground”.

These are the same principles you might apply to any visual work, just interpreted differently. You can find the same ideas in good novels, or even non-fiction, or even this very blog post — landmarks, details, connections. So for the most part, it’s really all about…

Contrast, contrast, contrast

Contrast is how we make any sense of the world. We look for differences and carve stuff up into groups based on those differences. Order is boring — and sometimes boring is what you want — but contrast is interesting.

Almost all of Romero’s rules are about managing contrast. Contrast between floor height. Contrast between textures. Contrast between light and dark. Contrast between cramped and open. Contrast between required, optional, and secret. Contrast between your initial perception of an area and the way you understand it when you revisit. Contrast between landmarks and filler.

You don’t want too much contrast, of course, or the result will be chaotic and confusing — so the hard part is figuring out how to make effective use of contrast. It should guide the player around your space, emphasizing things that are important and filling in spaces with more subdued details.

Paint a picture, weave a story. Which reminds me:

Other influences

I’ve had a blast listening to Liz Ryerson’s Doom Mixtape series, in which she plays through a community map and just talks about its design and game culture. I hesitate to even call this a Let’s Play, since the voiceover isn’t so much about the map as about its relation to Doom modding, the Doom community, and the larger gaming community. It’s super interesting and I like to just play old episodes in the background while I’m fiddling with a map.

She often touches on narrative elements, which are something I realized I really love a few years ago. I don’t know if she’d put it this way, but I think of it as any details that make a world feel like it exists independently of the player, as opposed to being an obstacle course specifically carved out just for you. The feeling that there’s just stuff going on, that the universe isn’t centered on you, that interesting things would still be happening even if you weren’t here. A fantastic example is when the exit of one level continues smoothly into the beginning of the next; it makes the world feel so much more connected, rather than just a pile of one-off maps, but virtually no one does this.

In a way this is at odds with the Petersen approach to mapping — one of my complaints about the Doom II progression is that several of the early levels (supposedly uncorrupted regular Earth places) are so abstract as to be meaningless. What is Dead Simple, this isolated courtyard where the only goal is to kill all the monsters? What would it have been if Doomguy had never passed through? That kind of disconnected feeling fits much better in the later Hell levels, but it’s pretty jarring to see it only two slots after The Waste Tunnels, which do a good job of suggesting… waste tunnels.

Through Doom Mixtape (possibly via YouTube’s autoplay?) I also stumbled upon Antroid’s blind Let’s Play of Doom II The Way Id Did, which is exactly what it sounds like. Doom II The Way Id Did was a community project to create a whole new set of maps that drew inspiration specifically from the way the original level designers approached the levels, so it’s pretty interesting to see someone who cares about design play through them all. The leader of the project (I think) also chimes in. Antroid cares about narrative too, which is great for me, though he and Liz are diametrically opposed on the hot topic of texture alignment.

Antroid has also LP’d Knee-Deep in Phobos, another map set that he had some thoughtful criticism on, and DTS-T which is a bit more goofy.

You may also be interested in this IGN interview with John Romero which takes place while Romero plays through the first episode of Doom, which he designed.

Right, yes, that would be nice. I’m going to keep building on top of the map I started last time, with the aim of turning it into a more respectable level. You may recall it looks like this:

When we last left our hero...

I have a tiny outdoor area with lava, a hallway with the ambiguous “star” texture, and a gray brick room I didn’t even bother retexturing. All of this is entirely arbitrary and a matter of taste, of course, but several things strike me.

  1. I already have three contrasting themes here, which could play against one another.
  2. I have a place you revisit, but it’s just meaningless backtracking at the moment.
  3. There aren’t any real landmarks to speak of, though of course the map is small enough not to need them.

I want to take the advantage I’ve already got and run with it, so I’ll just say I have a base built into a volcano, on top of an old tomb.

Does that make sense? Maybe not. But who cares? Doom is abstract — don’t worry too much about looking “real”. (I’m just gonna bold the more concrete tips I have.) If you want hyper-realistic detail, you’re probably using the wrong engine. Your goal here is to hint enough at a place that the player can fill in the gaps with their imagination, without ever consciously thinking about it. Trying too hard to create a “real” place may even backfire, if you develop a complex design in your head and then find out that Doom simply can’t express most of it.

So my first thought is to expand the outside area into a sort of volcanic crater. A really big chamber filled with lava should definitely work as a landmark. The red key can stay where it is, on a platform in the middle of the lava, except I want the platform to be a tall spire. I can figure out how to get there later.

Make the space bigger than you need! I always make everything too small (remember my ancient impulse to box everything in neatly), and I always regret it. It’s much easier to deal with extra space at the end than to keep having to create new space in the middle of a map.

All I did was draw a squiggly area here. When I’m reshaping an existing area, I like to go into vertex mode, make one of the existing lines horizontal or vertical, add a bunch of vertices all along that line, and then drag them around to shape the room. You can also just draw a new outline around it, of course, but that’s not as fun. (You can join sectors together by selecting them and pressing J. The final sector will have the properties of the first sector you selected.)

Drawing a volcano · The volcano in 3D

I put the red key spire at about the right height to catch the player’s eye when they walk outside. It would be more obvious if I changed it to the blue key, but I like that red fits the volcano theme out here. Keycards blink, so that should help.

With the raised ceilings in the middle, I made a conscious effort not to just redraw the outer edge, but smaller. That looks artificial. Instead I tried to make the inner shape a little smoother, and I aimed to put its vertices near the midpoints of some of the outer lines.

I think Romero said in that IGN interview that he details as he builds the map, but when I do that I get myself into trouble. Detailing is fun for sure, but if you change your mind about an area or need to move it around just a little bit, the details are a huge pain in the ass. At best, you have to destroy them all and recreate them. (Remember the crate!) Maybe it’d be easier if I were better at this and could be confident in my design from the get-go, but at least for now, I’m going to carve it out fairly rough and worry about making it pretty later.

This is still a dead end, alas. I do like that little nook in the southwest corner, a bit out of sight. Something else I struggle to remember is to not always make a whole room immediately visible or accessible. It feels so counter-intuitive — surely I want my design to be obvious and clear! But too obvious and too clear are also boring, as there’s nothing left for the player to explore. Also, more practically, there’s nowhere to hide monsters or secrets.

So I think I’m going to put an alcove in that nook. I want to have a switch, too, for affecting this volcano in some way. Otherwise there’s no point to going there!

Switches are a great thing. They give the player something to do, and they give the feeling that the world reacts to the player’s actions. This switch will probably be for progression, but sometimes it’s nice to have switches that aren’t particularly important, just for a bit of contrast. Doom II is full of doors and lifts that use switch textures right on the side, or rooms that open up once you press a readily-accessible switch.

Creating an alcove with a switch

That’s SW1GARG, if you’re wondering. I’m mostly using it for its rough metal background, which seems to fit this room. I didn’t want the switch to be a full 128 units high, so I made it about 72 high instead, and played with the switch texture’s vertical offset to get it to a nice height. I used METAL, a texture with two columns of rivets, for the sides and top.

That switch is a little too orthogonal, I think, so I’m going to switch to sector mode, select it, and use Edit Objects to rotate it 45°. That’ll give me a diagonal line 64 units long (the width of the switch texture), which would’ve been a huge pain to draw by hand.

Using Edit Objects to rotate a sector · Hm, this floor isn't quite right

Oops, that looks a little funny, since floor and ceiling textures follow the map grid. The player isn’t likely to see it in play, but I’m going to fix it anyway, by setting the floor rotation to 45. Rotating the floor actually rotates the entire map grid around (0, 0), so the texture was still a bit misaligned for me, and I had to play with the offsets a bit to make it look right. By default, the offsets are changed with the numpad — in increments of 8 with numlock on, and increments of 1 with it off. I have this rebound to the arrow keys for 8, or with Shift held down for 1. Also I’d like SLADE to be able to do this particular operation automatically, which maybe it will sometime.

Anyway! I haven’t made this switch actually do anything yet, but first, let’s figure out how the player gets over here. It seems reasonable that they might come through some tech stuff, but that’s a long way to walk, so I’ll put a little more cave too. I don’t know what’s going on here yet; I’m just drawing some shapes.

Drawing some areas to connect around the bottom

I’m pretty tired of that gray brick floor texture. I want to replace the hallway with tiles (FLOOR3_3) that match the walls, and I’ll continue them on into that first big room.

Changing the floor texture

I don’t want both rooms to have that floor, though. I’m also a little tired of this tan wall. So how about I make the other room STARGR2, the gray equivalent? Then I can use FLOOR0_6 for the floor. (Around this point you might benefit from changing the sort order in the texture dialog to “Usage Count”, which puts the textures you’ve used most frequently at the beginning of the list.)

Ah, but wait. One of Romero’s rules is that a change in floor texture means a change in floor height. I can get behind that. I need a change in floor height anyway, because I made my alcove a little higher than the starting floor. So I’ll add a few steps between the two rooms, and throw in a big door as well. I’ll even make the door bronze on one side and gray on the other, to match the room you enter into.

(You may notice I don’t ever say anything about ceilings. I’m half-convinced that the ceiling texture just doesn’t really matter. What ceiling does any area anywhere in Doom or Doom II have? Yeah, that’s what I thought. Valve has a rule that players don’t look up, for good reason.)

I’ll also add the door on the other side, leading into the cave. The floor texture changes here, which means I need a change in floor height! I made the cave floor a little lower, so you step down out of the “building”. (Hint: you can draw lines to carve up an area however you want, then hop into sector mode and Delete some of the pieces. You’ll be left with a void surrounded by one-sided walls. If you carve too much, you can always rejoin sectors with J.)

Making some steps · The outer corner, where base meets cave · Geometry of the second room

That’s all well and good, but what do I put in these empty boxes? Ah, that’s the hard part.

This is the part where I start to feel really conspicuous, because it’s the part where I always get stuck. I can think up individual gimmicks, and I can roughly carve out some types of areas, sure. But the meat of a map is the series of spaces you move through, and I haven’t really wrapped my head around how to even approach designing such a space.

It’s a little awkward, then, that I find myself sitting here trying to give you advice on doing just that.

Well, let’s see. I want contrast. That’s pretty open-ended — anything might be contrast. What I really want are some building blocks that help to provide contrast. How do my favorite maps carve up spaces? Smaller structures in a larger space come to mind, with the extreme example being city maps. Also, raised walkways that you can’t reach initially.

Okay, that at least puts a basic idea in my head. I’m going to put a magma chamber in the middle of my room, and I’m going to put a walkway on either side of it to cut the room in half. The chamber will have doors on either side, but I’ll stick a monster in there so you can’t just barrel straight through. The walkway also gives me a place to stick some baddies.

I’ll draw some of these areas, then delete the extra space in the middle, leaving solid walls behind. Then I just need some doors and texture work. Remember to unpeg walls with holes cut into them, like the walls above and below the doors, and lower unpeg the door tracks. I’m using DOOR1 here, which is a little squat door 72 units tall, but you can of course do whatever.

I’m also sticking a small platform with a super shotgun in the middle of the room. I’m using CEIL1_2, which despite the name, is pretty commonly used as a floor texture for a raised square platform with an item on it. The walls are SUPPORT3, another very common go-to for raised metal platforms (like most teleporter pads).

When all is said and done…

Drawing the magma chamber · Deleting the walls of the magma chamber · Textured magma chamber · Inside of the magma chamber

You may notice that I’m using that same trick from last time to make the lava brighter than the room itself. I did it with the raised platform, too, though not as intensely.

Lighting is really important for adding atmosphere to Doom. Compare that opening shot of Doom II with the same thing in fullbright:

MAP01 again · MAP01, in fullbright

Wow! That looks like some hot garbage.

You can’t just rely on the engine to do it for you, either, because the engine… doesn’t. There is no casting of shadows in Doom, whatsoever. No dynamic lighting, no light sources at all. (This isn’t true in GZDoom, which actually makes several stock Doom objects cast their own light, but the effect is fairly minor so as to not ruin the deliberate lighting of Doom maps.) The only lighting you get is the light level of sectors. Even the light level of a wall is just the light level of the sector it faces.

So if you want to have a large outdoor area with some buildings casting shadows, you have to actually draw the shadows as separate sectors on the ground and make them darker. Even ZDoom’s fanciest lighting tricks can only give you slightly better tools for doing manual lighting, like separate floor and ceiling lighting. (In vanilla Doom, you might fake the lava trick by drawing a very thin outer lava sector that’s dark, so the walls are also dark, and then just making the inner area bright. The ceiling would also be bright, but oh well!) If you want a smooth lighting transition, well, you just have to draw a lot of thin sectors and give them all slightly different light levels.
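
If you want to rough out the numbers for a gradient like that, the arithmetic is simple enough to script. Here’s a minimal sketch (my own helper, not anything built into SLADE) that splits a light transition into evenly spaced levels; vanilla Doom only renders 32 distinct shades anyway, so snapping to multiples of 8 is as fine-grained as it gets:

```python
def gradient_levels(start, end, strips):
    """Evenly spaced light levels for a run of thin sectors."""
    step = (end - start) / (strips + 1)
    # Snap to multiples of 8: vanilla only has 32 distinct shades.
    return [round((start + step * i) / 8) * 8 for i in range(1, strips + 1)]

# Fading from a dark cave (128) up to bright lava (192) over 7 strips:
print(gradient_levels(128, 192, 7))  # [136, 144, 152, 160, 168, 176, 184]
```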

I don’t have a lot else to say about lighting. Like everything else design-related, it’s just something you have to learn, and I’m still doing that. Look at maps you like, play around, see what works and what doesn’t.

Abrupt transition! The entrance of this map doesn’t make a lot of sense as I have it right now. The player starts in the middle of a hallway with monsters facing their back and most stuff in front of them. Doom’s spawn points often don’t make any sense, but this is particularly silly.

The spawn point, in the middle of the action

Well, that’s easy enough. I can just stick the player at the north end of the hallway, facing downwards.

Or… I could do something a little more interesting. I do like starting areas that give the impression I actually got here somehow. All that narrative stuff, remember. A dead-end hallway is not too great at establishing that feeling. Lots of Doom II levels just stick an unopenable DOOR2 behind you (which is weird since you don’t go out through that door at the end of any levels), but I can do better. Also I want to show you the sky hack.

This is a volcano, so I assume it has a side somewhere out there. I’ll say you climbed up the side and are facing the entrance of this weird volcano base. I guess I’ll start by drawing a squiggly area and sticking some textures on it. Then I’ll put a little building in one corner.

A new spawn area · Creating an entrance

Hmm. This looks pretty goofy. Having all the walls of a room be the same height is certainly reasonable, but this is outside. What can I do about this?

Enter the sky hack, arguably the Doom engine’s only special effect. The sky hack is that when two neighboring sectors both use F_SKY1, the upper wall between them isn’t drawn.

Rough illustration of the sky hack

This is kind of weird, so let me just do it and show you what happens. I’m going to draw a border of two sectors around this outside area. Both rings will have a floor height 64 units higher than the area’s floor. The outer ring’s ceiling will touch the floor (like a closed door), and the inner ring will have the same ceiling height as the area itself.

Drawing some outer sectors · Sky hack, visualized · Sky hack, without the sky

The last screenshot is exactly the same geometry, but with a different ceiling so you can see what the sky hack actually does. Those “missing” textures are the upper parts of the lines between the two rings, where the ceiling height changes. When both rings use the sky texture, the sky hack kicks in, and Doom doesn’t even try to draw those upper parts. It just lets the sky show through. Using this, we can create the illusion of a tall building surrounded by lower walls.

It’s not strictly necessary to even have two rings, but there are two advantages. One, if the player happens to catch a glance over the top of the shorter wall, they’ll see the floor of the inner ring, rather than an abrupt cut to sky. Two, it lets me extend the wall of the building beyond the wall of the “courtyard”, so it looks like it has some depth.

You may notice I just made the far outer wall a square, because it doesn’t actually matter — it’ll never appear to the player. I also marked those lines “Not On Map”, meaning they’ll never appear on the automap.

The sky hack has plenty of limitations, of course. If you need multiple buildings made of one-sided lines (because, say, they have doors in them), they’ll all generally have to be the same height: the true height of the outdoor area. And if you want a building shorter than the outer walls, you’re gonna have a bad time. Remember, the sky hack doesn’t actually make a wall “invisible”, it just draws the sky instead. So if you put a sector with a low sky inside a sector with a higher one… well, that doesn’t work out so well.

Sky hack doesn't work with short structures

You might also think it would be nice to show the sides of the volcano behind the building, but I can’t do that, for the same reason — you can’t make them visible “above” the building, because you’re not actually looking above the building, you’re looking at the ceiling in front of it.

This will do for now, though. I’m moving the player start to the outdoor area, and we are good to go. Er, almost. I made the outdoor area much higher than the indoor area, to flimsily simulate being on the outside of the volcano, so we need a way down. I’m going to make a lift.

It’s pretty typical to use STEP1 or STEP2 as the base of a lift, so the player knows what it is. Similarly, the side of a lift (especially one you can “use” to lower) is often PLAT1, though anything obviously different from its surroundings works. I’m also going to have to make that one wall upper unpegged, since it’s an upper part of a wall with a hole (far below!) punched in it, surrounded by one-sided walls.

Making a lift sector

You can make lifts that lower when you step on them, but I find that kind of jarring. Instead, I want to have a switch that lowers this lift. I have a couple one-sided walls available, so I’m going to carve a little hole in one and make an inset switch.

Creating a recessed switch

To hook this up, we finally need to use a sector tag. Just give the lift sector a tag of 1, and then be sure to use that tag when wiring the switch. (You can use the “New Tag” button in the properties dialog, or the “…” button in the prop panel, to get a new tag you haven’t used before.)

I’m using Plat_DownWaitUpStay, which is the generic Doom-style lift. (The default Doom delay is 105 tics, or three seconds. I would love if SLADE told you this.) I’ll use the same special directly on the south side of the lift itself, so you can summon it down again. You can use either 0 or 1 for the sector tag here; just like doors, a sector tag of 0 means the sector on the other side of the line.

Adding a sector tag · Setting up the switch

Now I can play the level and ride down my li—

Lift with hall of mirrors

Whoops! That’s the hall of mirrors effect, which you get when you forget a texture. I never gave this dividing line a lower texture, and SLADE didn’t warn me, because in the map’s initial state that part of the wall isn’t visible. (I would like SLADE to be cleverer about this, too!) I can fix this even in 3D mode if I want, by temporarily moving the lift down a bit. Lower unpegging keeps the texture aligned with the other walls, and now I have a lift!

Let’s recap. I now have several distinct areas (okay, two) with monster encounters in them, and this map doesn’t actually have any ammo. That might be nice to consider, but to know how much ammo to sprinkle around, we need to know a bit more about Doom’s monsters, and how much ammo it takes to kill them. Or we could just play the map and put more ammo in places where we run out, but this way has more numbers, and I do like numbers.

Doom II has nine weapons: fist (and berserk fist), chainsaw, pistol, shotgun, super shotgun, chaingun, rocket launcher, plasma gun, and BFG 9000. An interesting quirk of Doom’s loadout is that damage is randomized — every shot of every weapon has its damage multiplied by a random factor, which varies from 1d3 to 1d10. A more interesting quirk is that the shotguns actually fire multiple pellets, and each pellet has its damage randomized. So the super shotgun, with its impressive 20 pellets per shot, actually has the most consistent damage output (thanks to the law of large numbers).

With that as our baseline, you can very roughly describe all the weapons in terms of the super shotgun. All of the following do roughly the same amount of damage (200) on average, to within 10%, assuming you actually score a direct hit:

  • 18 punches
  • 2 berserk punches
  • 18 chainsaw hits
  • 20 pistol shots
  • 3 shotgun shots
  • 1 super shotgun blast
  • 20 chaingun shots
  • 1 rocket
  • 9 plasma shots
  • ½ BFG ball
  • 2 BFG tracers (one shot fires 40 tracers)

You can draw a very rough comparison of ammo this way as well: 20 bullets ≈ 2 shells ≈ 1 rocket ≈ 10 cells.
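
If you want to see that law-of-large-numbers claim in action, it’s easy to simulate. The sketch below uses the damage rolls from the Doom source — 5 × 1d3 per hitscan pellet, 20 × 1d8 for a rocket’s direct hit — and simplifies the rocket’s splash to the full 128, i.e., a point-blank hit; the simplifications are mine:

```python
import random
import statistics

def ssg_blast():
    # 20 pellets, each rolling 5 * 1d3
    return sum(5 * random.randint(1, 3) for _ in range(20))

def rocket_point_blank():
    # Direct hit (20 * 1d8) plus the full 128 splash damage
    return 20 * random.randint(1, 8) + 128

for name, shot in [("super shotgun", ssg_blast), ("rocket", rocket_point_blank)]:
    rolls = [shot() for _ in range(100_000)]
    print(f"{name}: mean {statistics.mean(rolls):.0f} ± {statistics.stdev(rolls):.0f}")
```

Both average around 200, but the rocket’s spread comes out more than twice the super shotgun’s — twenty small dice are a lot more predictable than one big one.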

Given that, here is the number of point blank super shotgun blasts it should take to kill each monster.

  • 1 — zombie guy (20 HP), barrel (20 HP), shotgun guy (30 HP), ss nazi (50 HP), imp (60 HP), chaingun guy (70 HP), lost soul (100 HP), demon/spectre (150 HP)
  • 2 — revenant (300 HP), cacodemon (400 HP), pain elemental (400 HP)
  • 3 — hell knight (500 HP), arachnotron (500 HP), mancubus (600 HP)
  • 4 — arch-vile (700 HP)
  • 5 — baron of hell (1000 HP)
  • 15 — spider mastermind (3000 HP)
  • 20 — cyberdemon (4000 HP)

Monsters with an exact multiple of 200 HP will need an extra shot about half the time: 200 is only the average damage of a blast, so half the time the roll will come in under it. Plus you’re probably not going to hit with every single pellet every single time.
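
Here’s the same point as a quick Monte Carlo, using the per-pellet roll from above and generously assuming every pellet connects:

```python
import random

def blasts(n):
    # n super shotgun blasts = 20n pellets, each 5 * 1d3
    return sum(5 * random.randint(1, 3) for _ in range(20 * n))

trials = 100_000
kills = sum(blasts(2) >= 400 for _ in range(trials))
print(f"Two blasts drop a 400 HP cacodemon {kills / trials:.0%} of the time")
# Comes out around 54% -- barely better than a coin flip, so budget
# for a third blast about half the time even with perfect aim.
```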

Using this, I can have a rough estimate of how much ammo I’ll need at bare minimum. A box of shells gives 20 shells, which should be able to take out either 10 or 20 minor enemies, depending on whether the player uses the shotgun or super shotgun. Shotgun guys drop a shotgun, which gives 8 shells, which can help kill a few more baddies. And so on. That all needs plenty of padding, of course, since running out of ammo in Doom is the worst possible thing.
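
That estimate is scriptable too, if you like numbers as much as I do. A toy shell budget — two shells per super shotgun blast, blast counts from the table above, a monster roster I made up for illustration, and an arbitrary 50% padding factor:

```python
# Blasts-to-kill from the table above; the roster is invented for illustration.
blasts_to_kill = {"imp": 1, "shotgun guy": 1, "revenant": 2, "hell knight": 3}
roster = {"imp": 8, "shotgun guy": 3, "revenant": 2, "hell knight": 1}

shells = sum(2 * blasts_to_kill[m] * count for m, count in roster.items())
print(f"{shells} shells bare minimum, ~{round(shells * 1.5)} with 50% padding")
```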

Incidentally, that makes map sets kind of hard to balance! Almost any map will leave the player with plenty of ammo at the end, meaning they still have that ammo when they start the next level. But it’s considered polite to make maps playable from a “pistol start” — i.e., with only a pistol and 50 bullets, like you just started the game. How do you make both options equally viable and (more or less) equally challenging? I have no idea.

For now, I’m just going to stick a box of shells in the room with the magma chamber, and sprinkle a few shells (4 each) around the other areas. I’ll put a shotgun guy in the spawn area, too, facing away from the player, as a way to get a shotgun.

I don’t know yet how I’ll balance this per skill level, but I can treat this as medium difficulty (Hurt Me Plenty) and scale it up and down later.

Encounters

What about health? I can’t ballpark that as easily, since the amount of damage the player takes is entirely dependent on their skill level. Or, well, not entirely. It also depends on how the encounters are designed.

Doom’s combat is very very much about movement. Taking advantage of the terrain is incredibly important, and many monsters outright force you to do it: the revenant has homing rockets, the arch-vile has a line-of-sight attack, the mancubus fires a wide spread.

A curious feature of Doom’s combat is that most monsters are just not particularly difficult to kill without taking much damage, especially for experienced players. If you want an especially challenging encounter, you have to resort to some mild trickery. Ambushing the player is a classic move, though you’ll have to be clever nowadays, since everyone has seen monsters appear when they grab a key. Opening monster closets back the way the player came is certainly surprising, and can give the feel that somehow reinforcements appeared from nowhere. You can also force the player to fight in very close quarters, have monsters appear on both sides of them, or cut off their escape route. A few imps are much more dangerous in a cramped, dark room than in an open arena.

Spawning monsters is also an option. Typically that’s done by having a big closed-off room somewhere, filling it with monsters, and putting some “monster cross” lines that teleport to various places. Add a teeny tiny channel to connect that room to the rest of the level, so the monsters can hear the player. When they do, they’ll wake up, start milling around, and bumble over the teleport lines. (You can also literally spawn monsters with ZDoom’s scripting, but I don’t like to do that, since it means the “monsters remaining” count in the alt HUD is inaccurate.)

What? I never explained sound? There’s not much to it, really; if a monster hears the player use a weapon, it’ll wake up and start looking for the player. Sound travels between sectors freely, but does not pass through closed doors, which is why firing a shot on most maps doesn’t immediately wake up the whole world. That’s also why there are some teeny tunnels sprinkled throughout the stock maps — there’s one in MAP01 of Doom II, to let sound reach the secret room with the imps in it, so they’ll hear you and open the door to come out. You can fine-tune how sound spreads by marking lines as “blocks sound”, though keep in mind sound only stops after passing two such lines.
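
For the curious, the engine handles this (P_RecursiveSound in the Doom source) as roughly a flood fill that carries a count of “blocks sound” lines crossed. Here’s a loose model of it — the sector graph and the bookkeeping are simplified here, but the two-line rule falls right out:

```python
def flood(sector, graph, best=None, crossed=0):
    """Which sectors hear a noise made in `sector`.

    graph[s] = list of (neighbor, blocks_sound, is_open) tuples.
    """
    best = {} if best is None else best
    best[sector] = crossed
    for neighbor, blocks_sound, is_open in graph[sector]:
        if not is_open:        # closed door: no opening, no sound
            continue
        c = crossed + blocks_sound
        if c > 1:              # the second "blocks sound" line stops it
            continue
        if c < best.get(neighbor, 2):
            flood(neighbor, graph, best, c)
    return set(best)

# A hallway, a room behind a sound-blocking line, a closet behind another:
graph = {
    "hallway": [("room", 1, True)],
    "room":    [("hallway", 1, True), ("closet", 1, True)],
    "closet":  [("room", 1, True)],
}
print(flood("hallway", graph))  # {'hallway', 'room'} -- the closet sleeps on
```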

Okay, so, how much health? It’s up to you! I’m not sure there’s even such a thing as too much health; the player can only take so many hits from the stronger monsters anyway, and extra medikits don’t help once you’re dead. Doom II’s MAP21: Nirvana starts you out in a room with 20 medikits, every one that exists in the level. And that’s a map that consists mainly of imps and shotgun guys.

Feel free to be liberal with health and armor bonuses, especially. I love those. Everyone loves those.

Right, okay. So you go down the lift and enter this hallway, with a red door on the left and a volcano area on the right. At the end of the hallway is another room… and I’m gonna stick a door there so it’s not just exposed for all to see. You go in that room, you go through the magma chamber, you fight some dudes.

Hold up; this is sounding a bit too linear.

There are a couple different ways to think about linearity. Running straight through this room is obviously linear — at any given time, there’s only one thing you can do. We speak of games like Metroid as being “non-linear”, but every Metroid game still intends for you to acquire each new powerup in a specific order. In that case, the progression is still linear, but there are often multiple paths you can choose from — some will help you progress, some will be blocked off until you’ve progressed further, some will be optional areas, and some will be deliberate secrets. That’s the kind of nonlinearity Doom tends to have.

You can also take it a step further and have true nonlinearity, like MAP19: The Citadel. That map has three bars guarding the exit, each bar locked with a different key. All three keys are in the map, but you only need two of them to squeeze through the bars and reach the exit. So you can take multiple different routes through the map, skipping different areas depending on what you’re going for. That kind of design is much more difficult, of course, and I don’t think there’s another similar example in any canonical Doom level.

To make my map a little less linear, I’m going to add a little side room to the magma chamber, and put a switch there that opens the next door. I’ll also add a little pointless side room that has some supplies, because Doom is full of those, and I like them. (Seriously, you won’t believe how much of Doom and Doom II are completely optional. I’ve heard that this kind of thing isn’t very common in modern fan maps, which is sad. So please, put some little neat side areas in your map! It really helps with that whole narrative thing — there’s stuff in this world that doesn’t exist just to hurry you along to the exit.)

Outside of a nonlinearish side room · A little closet with goodies · Side room, with computers · Overhead view of this work

I’m just slopping this together and don’t claim it’s great, but I’ve tried to put a few good ideas in here. The side room, small as it is, still has two ways to go — the obvious one just leads you to the side of that platform, whereas the back way leads to some stairs. You can see the switch from the main room, so you know where you’re going. The computers blocking the middle of the room are at least moderately interesting.

I got that light pattern by using TLITE6_4 and setting the floor scale to 4 in both directions. You can do some pretty cool things with just the stock textures, by using bits and pieces of them in creative ways, and being able to scale them is super useful.

Note that I used Door_Open, which opens a door permanently. Probably what you want if you’re using a remote switch to open it.

Because the side room connects back to the main room, sound can pass freely into it. If I fired a shot in the main room like this, everything inside would wake up immediately! So I flagged all of those imps as “Ambush”, which means that they won’t start chasing the player just because they hear weapons fire. They are not deaf. The difference is subtle: if an ambush monster hears you, it’ll attack as soon as it can see you, even if it’s not facing you. A monster that hasn’t heard you at all won’t know you’re there until you step in front of it.

Whew! 7000 words in and we’ve made a whole three rooms. I’d better hurry this up.

Now I have another room to make interesting. Fuck it, I’m putting a lava chasm. I’ll have a switch you need to press, and a hell knight in the way. Imps on the other side will make life a little more uncomfortable.

Fuck it, have a hell knight

What can a switch do to get you over a lava pool, you ask? Well, let me tell you about this nifty special called Floor_RaiseByValueTxTy. It raises a floor, and changes that floor’s texture and type (hence, TxTy) to match the floor it becomes level with. So the lava texture will change to a regular floor texture, and the 20% damage will disappear. Magic. (You can quickly tell how far it needs to rise by looking at a wall of the pit in 3D mode.) Remember, the raising floor will need some lower textures!

I’m also going to put a little teleporter alcove in the north end of the pit. There are two schools of thought here. One is that if you fall in a pit that’s obviously full of lava, that’s your own dumb fault. The other is that inescapable pits are just plain bad design, and every pit should have a way out. I’m going to go with the latter here, just so I can show you how teleporters work.

  1. Make a 64×64 square, and give it one of the GATE* floor textures.
  2. Make sure all its lines are facing outwards.
  3. Put a “Teleport Destination” thing where you want the teleporter to lead. Keep in mind that the player will be facing the same way the destination thing points.
  4. Give it a TID, a “thing id”, which is like a sector tag but for things. Sector tags and TIDs are different, so you can have both a sector tag of 1 and a TID of 1 and they’ll never interfere with each other.
  5. Give all four sides of the teleporter the Teleport special, and make the first arg the TID you used for your teleporter destination. Make the lines “Repeatable” and “Player Cross”, of course.

And that’s it! Easy peasy.

If you want to be super fancy, you can make the pad flicker. Set its light level to something higher than the surrounding area, and give it the sector special “Light Strobe 1 Sec”. (Documentation for the light-related sector types is kind of atrocious, alas. The Doom wiki is the best I’ve found, though keep in mind ZDoom’s sector specials are 64 plus those numbers.) Now it will normally appear as dim as the surrounding sector, but every second it’ll flicker to its assigned light level. You can even prevent this from making the ceiling flicker, by setting the ceiling light level and checking “Ceiling Light Absolute” to make it independent of the sector lighting.

Here’s what I have now. (The player start is just there so I could test in ZDoom quickly.)

Lava chasm · Teleporter alcove · Bridge rising from the lava

Got it? Rad.

I have a confession to make: I knew all along how you’d get the red key. I probably just gave it away, too: it’s a combination of raising a platform and making a teleporter.

I’m carving out room for a big ol’ ledge around the outside of my volcano, which will connect that larger outdoor area to a small teleporter alcove. Then I’ll link that teleporter to a teleporter on the red key spire, and vice versa, so you can teleport both ways. It’s no different; you just create two teleporters that happen to lead to each other. The teleport doesn’t trigger if you cross the line from the back, so the player can step off the teleporter with no problems.

Finally I’ll wire up that switch I made back in the beginning. Unfortunately, my volcano is really deep, and I need to raise it 320 units. SLADE currently won’t let you give arguments greater than 255, because it’s illegal in Hexen-format maps… even though it’s fine in UDMF maps. Oops.

Lucky for me, this is ZDoom, where there are at least two ways to do anything! So I’m going to use Generic_Floor instead, which will let me say the floor should raise to meet the next-highest neighboring floor. In my map, that’s the outdoor area you start in. (For now you have to check its arguments on the ZDoom wiki, but I’m gonna go make SLADE aware of them right after this.) So my target is 3, and my flags are 1 (copy texture, set type to 0) + 4 (copy from neighboring sector) + 8 (raise) = 15.
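
If the flag math reads like line noise, it’s just a bit field. A tiny illustration — the names here are my own labels for the values spelled out above; check the ZDoom wiki for the authoritative list:

```python
# My own mnemonic names for the Generic_Floor flag bits described above.
COPY_TEXTURE  = 1   # copy texture, set sector type to 0
FROM_NEIGHBOR = 4   # ...copying from a neighboring sector
RAISE         = 8   # move up instead of down

flags = COPY_TEXTURE | FROM_NEIGHBOR | RAISE
print(flags)  # 15 -- the value to plug into the special's flags argument
```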

Adding a path around the volcano · View from the spire, after raising the path

As you can see from the overlay there, I already made SLADE aware of the arguments and will be pushing that shortly. I am super on the ball.

With that, the map is completable again! I feel kinda bad that the exit area didn’t get touched yet, though, so I’m going to dig it down into a weird little tomb area and stick some… I dunno… revenants. They’re skeletons. Seems fitting.

Expanding on the tomb

Yep. Okay. Beautiful.

Lighting

Good lighting is hard, and I don’t even know where I’d start to make it better in this map. I varied it a little as I went, but it could be much better.

One thing I did do was make a slight change to the red door. Remember its fifth argument, “Light Tag”? You can give that a sector tag, and when the door opens or closes, the tagged sectors will fade between the door’s darkest neighbor (when closed) and brightest neighbor (when open). I tagged the sectors at the beginning of the tomb, so when the door opens, those sectors lighten gradually, as though the light were trickling in. It’s a pretty neat effect, even when subtle.

Hmm. I was saving the really good ZDoom trickery for the next part, but I’ll whet your appetite with one lighting thing that you simply cannot do in vanilla Doom. Remember this little cave with the teleporter?

Teleporter alcove off the lava chasm

The lava is very bright, but the walls are very dark. That goes for other places that have lava as well, but it’s particularly striking here.

It turns out there’s actually something we can do about this.

Draw a little sector out in the void, not connected to anything. It’s a good idea to keep it close to the cave, and mark its walls “Not On Map”. This is a control sector. It’s a junk throwaway sector, not really a part of the world. Control sectors exist so that specials can transfer some of their properties to other sectors.

I’m going to make this sector the same height as the cave (Ctrl+Shift+C and Ctrl+Shift+V to copy/paste properties are very useful here), but its ceiling will be somewhat lower than the actual cave’s ceiling. You can even do this in 3D mode, since you can walk through walls and fly around all you want.

The lava’s light level is 192; the cave’s light level is 128. I’ll make my sector’s light level 160, right in the middle. I also need to give the cave itself a sector tag.

Now the magic happens! Pick one of the walls of the control sector, and give it the ExtraFloor_LightOnly special (under “Renderer”). Give it the sector tag of the cave, and don’t worry about the “type”. And that’s it. You don’t need any triggers.

Teleporter alcove, with partially lit walls

The effect is subtle, so I hope you can see it, but the tops of the walls are darker than the bottoms! The 160 light in the control sector was transferred to the walls of the cave, but only between the floor and ceiling of the control sector.

You can use this multiple times to give more than one “layer” of lighting to the walls of a sector, so if you were really determined, you could make a rough gradient here. All the properties of the control sector’s lighting are transferred, so you can use a sector special like “Light Flicker”, and have flickering light on only part of a wall. You can transfer colored or sloped lighting, too. Ooh, but I’m getting ahead of myself.

Textures

Let me show you the tomb in 3D mode, before and after I did some manual texture alignment.

Tomb, no texture alignment · Tomb, manual texture alignment

Yes, manual. Auto-align is great, but it only gets you so far. In particular, it only does horizontal alignment (for now!), so you’re on your own with stairs. Aligning a wall with a floor or ceiling isn’t necessarily even possible, automated or not. But it looks so much better with the textures aligned, right?

Another texturing quandary: what do you do when you have an obviously tiling texture like METAL2 on diagonal walls, or other places that aren’t clear multiples of the texture width? You can fiddle with the geometry until it is a multiple, of course, and you can also use non-tiling textures to fill in the gaps. But there’s also a neat trick (which I picked up from Antroid’s DTS-T videos, of all places) that I already snuck into this post. Did you catch this?

Recessed lift switch, with neat side textures

Look at the side of that alcove. That’s just our old friend STARTAN2. But the round parts tile every 64 units, and this wall is only 16 units long. Why isn’t it cut off?

Map showing the side lines split in half

The secret is that I cut the wall in half! You’re actually seeing the left edge and right edge of a “round part”, mashed together in the middle. Doom is paletted, so a lot of textures can be spliced together like this without leaving an obvious seam. You can use the same trick to stretch METAL2 across a diagonal wall:

Line-splitting trick used to make neat diagonal walls

Here I split the wall into three segments. If you look very closely at the rivets, you can see where I did it, but the effect at a glance is still pretty nice. Of course, if you do this, you never want to use auto-align near those walls, and you’ll have to redo it if you change the geometry later. So probably best left for last.

Detailing

You can also use textures to break up the monotony of a long wall. Most textures in Doom have some kind of variant you can use for this. Take my lava chasm room, which has STARGR2 all the way around. I can make that more interesting just by plopping in a few vertices and changing parts of the wall to the sister texture STARGR1.

Really monotonous wall · Slightly less monotonous wall

Check out the first hallway of Doom II’s MAP01, and you’ll find that every 64 units of the wall is a different texture.

You can also add “struts”, like I did around the switch alcove in the above screenshots. A long platform might want physical struts in the form of tiny square “voids” with a support texture. Just something to break up the monotony. Remember: contrast!

I’m under the impression that detailing is a teeny bit controversial in the Doom community at the moment, since a lot of mappers are kind of going overboard and making extremely detailed maps where every room needs five layers of trim and inset lights every three feet and all this weird nonsense. I, for one, don’t think you need all that much. I mentioned MAP21: Nirvana earlier; do you have any memories of it? Did it seem weird and complicated and confusing, or really give you any kind of feelings at all? Maybe you should check out its automap. I’m pretty sure my map already has more lines than half of Doom II’s maps.

And of course, I stress yet again: any kind of detailing is a pain in the ass to change once you’ve done it.

More alcoves

Yeah, sure. That little bit of enclosed cave, for example, could stand to at least have a forking bit somewhere. And while I’m in there, I want to cut it up and vary the height and lighting a bit. Make it more, you know, cavey.

One thing I try to do is lightly nudge the player towards the optional areas first. Otherwise, they may just continue down the “progression” path and never remember to come back and check out the alternatives. There are a couple biases you can try to take advantage of here — most people will be drawn to the closest option first, or in the case of a fork, will tend towards the right. So I’ll make a bit more room and put an alcove on the right side, with some goodies in it.

Adding a side room to the cave

I don’t know if you’ve noticed, but I try to avoid having any perfectly horizontal or vertical walls in caves. The reason is that the vanilla Doom engine has a feature called fake contrast, which makes horizontal walls appear slightly darker and vertical walls appear slightly brighter. It works pretty well to accentuate sharp right angles, but with cave architecture it can make walls appear brighter for seemingly no reason.

Without fake contrast · With fake contrast

The left screenshot has a slanted wall; the right screenshot is the same wall shifted slightly to be vertical. ZDoom lets you control this as a user setting, a map setting, or even per-wall in UDMF, but I find it easier to just not draw orthogonal lines in caves. It forces me away from drawing boxy areas, anyway.

There’s one other place I have my eye on — those raised platforms dividing the magma chamber room. It’s a hallmark of Doom that you can reach almost any area you can see, even monster walkways. I think I’ll make them a lift that’s activated by a switch on the back of the magma chamber. Then I can put a few goodies on the platforms too, or maybe swap an imp for one of the former humans.

Adding a switch to the back of the magma chamber

Secrets

I love secrets. A lot of what I love in Doom II is its really bizarre secrets, many of them designed by Sandy Petersen. Even MAP01 has a secret involving a jump onto a seemingly irrelevant decoration.

I’m not feeling quite that cruel right now, but I do want to put a secret atop one of those lifts. I’ll use a very old Doom trope and hint at it by using a different texture. Making it an actual secret is pretty easy: just pop open the sector properties and check the “Secret” box on the “Special” tab.

Try to spot the secret door · On my way to the secret · Nice, I found it

In that middle screenshot, you can see another possible texturing trick: a single wall can have two different textures by putting a zero-height sector on the other side of it. In this case it’s a door, of course, but it could just be a dummy sector like I used for the sky hack.

There’s one more thing I need to do here. Check out the automap. I changed my automap colors back to traditional Doom here so the problem is more obvious.

Automap, giving my secrets away

The automap shows walls in red, but shows doors (actually, any change in ceiling height) in yellow. That completely gives the secret away! Luckily there’s an easy fix for this: give the line the “Secret” flag, which will make it show on the automap as if it were a one-sided wall.

There are other places you might want to use this flag; for example, even though I have the outer edges of the starting area marked “Not On Map”, the next set of edges are drawn in yellow. Because, of course, the ceiling height changes there. I think that looks goofy, so I’m flagging them as “Secret” as well. I like to have a tidy automap, hiding evidence of rendering tricks. Just be careful not to go overboard and make the automap useless or misleading.

Note that the iddt cheat, which reveals the whole automap, also shows all the lines marked “Not On Map”, so it’s not an accurate picture of what the player will see normally. Development versions of ZDoom have a console command, am_cheat 4, that will reveal the full automap but leave hidden lines hidden.

Decorations

A super duper easy way to make spaces a little more interesting is by sprinkling around some stock Doom decorations. Add in a few lighting effects, and you’re off to the races, whatever that means.

Placing a candelabra, and faking some light · Sprinkling stalagmites around the start area

Oho, it looks like I improved on the starting area without telling you. I wonder how I did that.

Designing for multiplayer

Doom has two multiplayer modes: co-op and deathmatch. They need to be approached somewhat differently.

Co-op

Co-op is easy to make work, at least: just make sure you drop in player starts for the other players, 2 through 4. (ZDoom supports up to 8 players, if you’d like to do so as well.) There are probably more considerations than this, but offhand I can think of four major wrinkles that co-op adds.

  1. Only one player can pick up any given ammo, health, armor, or dropped weapon. Weapons and keys do remain on the map, and every player can pick them up. (A player who already has a weapon can’t keep grabbing it in co-op and get infinite ammo.) So giving the player a shotgun by way of a shotgun guy doesn’t work so well in co-op. I can fix this by putting an actual shotgun in the starting area. That will look redundant in single-player, so I can just remove the “Single Player” flag, and it’ll only appear in co-op and deathmatch.

  2. Ammo is split among players. This seems fine, since there are still the same number of monsters overall. But in co-op, a player who dies respawns from a pistol start without restarting the map, so it’s easy to completely run out of ammo halfway through the map. It’s up to you how much you want to compensate for this, since giving tons of ammo may make a team of good players ridiculously overpowered. (Maybe that’s why several Doom II levels just have a cyberdemon hanging out near the spawn point in co-op.) Health and armor have similar problems.

  3. Some puzzles might be much easier in co-op. Say you have a timing puzzle, where you press a switch and have barely enough time to run over to a secret lift. In co-op, one player could just wait by the lift while another presses the switch, which makes the puzzle trivial. Doom tends not to have very deep puzzles and this is probably not a big deal, but it’s worth keeping in mind.

  4. Because players might be in separate areas of the map at any given time, and in particular a player might respawn from the start point, you have to be careful not to permanently block off parts of the map or otherwise risk separating players.

If you want your map to be great for co-op, rather than just not-broken, you can go much further. You might use decorations that only appear in co-op to block off a path, creating a new puzzle that only exists in co-op and requires actual cooperation to solve. You might up the number of monsters considerably. (Of course, you can’t distinguish between 2 players and 8, so don’t go overboard.) You might even create separate start areas for each player and make them work together to meet up.

Deathmatch

Deathmatch is a little different to think about. There are deathmatch spawn points, too (and you should have at least 4), but they tend to be sprinkled all over the level. Otherwise, uh, the players would just keep shooting each other in a tight area.

That can pose something of a problem for your design, since players may start midway through a level without some switch having been pressed yet. Players do spawn holding all the keys, so locked doors aren’t a problem, but other kinds of doors might be. The switch in my magma chamber, for example, is the only way to open the door to the gray room — so if a player starts in that gray room, they won’t be able to go back. In my case, I think that’s okay, since they could just continue forwards through the level and ultimately loop around. If this were a dead end, I’d need to do something like make the door openable normally from the inside.

Many maps have every weapon sprinkled around in deathmatch, and that’s easy enough to do by just removing their “Single Player” and “Cooperative” flags.

Deathmatch is also a good reason not to have too many dead ends in the first place, since a player won’t have anywhere to run when cornered. On the other hand, this can be a feature — I’m going to put the BFG9000 in the exit tomb, so that going to get it is somewhat of a risk.

There’s also an -altdeath deathmatch mode, in which most pickups respawn, so you don’t have to worry too much about ammo.

Sanity checks

SLADE has a “Map Checks” panel, which can find basic errors. It may also find a couple false positives in cases like untextured walls behind a sky hack.

Run through your level and make sure it works! You can move the player start when fiddling with a particular contraption, but there’s no substitute for actually playing through your own map from start to finish. I think Romero said in the IGN interview that he only played his maps from the beginning, so he’d get to know very well how they’d feel to a player. Your mileage may vary.

If you don’t want to get bogged down in fighting, you can always play with -nomonsters, which is one of the run configurations in SLADE. This is one excellent reason not to rely too strongly on effects that trigger when monsters die. (I haven’t shown you how to do this. It’s deliberate.) Another excellent reason is that it’s very common to play deathmatch with -nomonsters.

Triple-check that your doors actually work. I have a bad habit of creating and texturing a bunch of doors in a row, then forgetting to actually make them usable. Also, make sure they’re repeatable!

That’s all for now, but I am hard at work on part 3, in which we shall break all the rules. Or, a lot of the rules.

Here’s my version of this map. I even included a surprise, to encourage you to actually look at it in an editor. Send me yours, so I can put it in this list!

If you like when I write words, you can show your appreciation — and force me to write more often — by throwing a few bucks at my Patreon!

Do the math on your stock options


Are you considering an offer from a private company, which involves stock options? Do you think those stock options might be worth something one day? Are you confused? Then read this! I’ll give you some motivation to learn more, and a few questions to consider asking your prospective employer.

I polled people on Twitter and 65% of them said that they’ve accepted an offer without understanding how the stock options work.

I have a short story for you about stock options. First: stock options are BORING AND COMPLICATED AND AWFUL. They are full of taxes, which we all know are awful. Some people think they’re fun and interesting to learn about. I am not one of those people. However, if you have an offer that involves stock options, I think you should learn a little about them anyway. All of the following assumes that you work for a private company that is still private when you leave it.

In this post I don’t want to explain comprehensively how options work. (For that, see how to value your startup stock options or The Open Guide to Equity Compensation.) Instead I want to tell you a story, and convince you to ask more questions, do a little research, and do more math.

I took a job 2 years ago, with a company with a billion-dollar-plus valuation. I was told “we pay less than other companies because our stock option offers are more generous”. Okay. I understood exactly nothing about stock options, and accepted the offer. To be clear: I don’t regret accepting the offer (my job is great! I ❤ my coworkers). But I do wish I’d understood the (fairly serious) implications at the time.

From my offer letter:

the offer gives you the option to purchase 114,129 shares of Stripe stock. [We bias] our offers to place weight on your ownership in the company.

I’m happy to talk you through how we think about the value of the options. As far as numbers: there are approximately [redacted] outstanding shares. We can talk in more detail about the current valuation and the strike price for your options.

This is a good situation! They were being pretty upfront with me. I had access to all the information I needed to do a little math. I did not do the math. Let me tell you how you can start with an offer letter like this and understand what’s going on a little better!

what the math looks like (it’s just multiplication)

The math I want you to do is pretty simple. The following example stock option offer is not at all my situation, but there are some similarities that I’ll explain in a minute.

The example situation:

  • stock options you’re being offered: 500,000
  • vesting schedule: 4 years. you get 25% after the first year, then the rest granted every month for the remainder of the time.
  • outstanding shares: 100,000,000 (the number of total shares the company has)
  • company’s current valuation: 1 billion dollars

This is an awesome start. You have options to buy 0.5% of the shares of a billion dollar company. What could be better? If you stay with the company until it goes public or dies, this is easy. If the company goes public and the stock price is more than your exercise price, you can exercise your options, sell as much of the stock as you want to, and make money. If it dies, you never exercise the options and don’t lose anything. win-win. This is where options excel.

However! If you want to ever quit your job (in the next 5 years, say!), you may not be able to sell any of your stock for a long time. You have more math to do.

ISOs (the usual way companies issue stock options) expire 3 months after you quit. So if you want to use them, you need to buy (or “exercise”) them. For that, you need to know the exercise price. You also need to know the fair market value (current value of the stock), for reasons that will become apparent in a bit. We need a little more data:

  • exercise price or strike price: $1. (This is how much it costs, per share, to buy your options.)
  • current fair market value: $1 (This is how much each share is theoretically worth. May or may not have any relationship to reality)
  • fair market value, after 3 years: $10

All this is information the company should tell you, except the value after 3 years, which would involve time travel. Let’s see how this plays out!

time to quit

Okay awesome! You had a great job, you’ve been there 3 years, you worked hard, did some great work for the company, you want to move on. What next? Since your options vested over 4 years, you now have 375,000 options (75% of your offer) that you can exercise. Seems great.

Surprise! Now you need to pay hundreds of thousands of dollars to invest in an uncertain outcome. The outcomes (IPO, acquisition, company fails) are all pretty complicated to discuss, but suffice to say: you can lose money by investing in the company you work for. It may be a good investment, but it’s not risk-free. Even an acquisition can end badly for you (the employee). Let’s see exactly how it costs you hundreds of thousands of dollars:

Pay the exercise price:

The exercise price is $1, so it costs $375,000 to turn your options into stock. Your options go poof in three months, but you can keep the stock if you buy it now.

What?! But you only have 300k in the bank. You thought that was… a lot. You make an amazing salary (even $200k/year wouldn’t cover that). You can still afford a lot of it though! Every share costs $1, and you can buy as many or as few as you want. No big deal.

You have to decide how much money you want to spend here. Your company hasn’t IPO’d yet, so you’ll only be able to make money selling your shares if your company eventually goes public AND sells for a higher price than your exercise price. If the company dies, you lose all the money you spent on stock. If the company gets acquired, the outcome is unpredictable, and you could still get nothing for all the money you spend exercising options.

Also, it gets worse: taxes!

Pay the taxes:

The value of your stock has gone up! This is awesome. It means you get the chance to pay a lot of taxes! The difference in value between $1 (the exercise price) and $10 (the current fair market value) is $9. So you’ve potentially made $9 * 375,000 = about 3.4 million dollars.

Well, you haven’t actually made that, since you’re buying stock you can’t sell (yet). But your local tax agency thinks you have. In Canada (though I’m not yet sure) I might have to pay income tax on that 3.4 million dollars, whether or not I have it. So that’s an extra 1.2 million in taxes, without any extra cash.

The tax implications are super boring and complicated, and super super important. If you work for a successful company, and its value is increasing over time, and you try to leave, the taxes can make it totally unaffordable to exercise your options. Even if the company wasn’t worth a lot when you started! See for instance this person describing how they can’t afford the taxes on their options. Early exercise can be a good defense against taxes (see the end of this post).
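Since the math really is just multiplication, here’s a minimal sketch of the example above in Python. The numbers come from the example offer; the 35% tax rate is a made-up placeholder (real option taxation depends on your country, your income, and the option type), so treat this as an illustration, not tax advice.

# Back-of-the-envelope math for the example offer above.
# The 35% tax rate is a made-up placeholder, not a real rate.
options_granted = 500_000
years_worked    = 3
vesting_years   = 4
strike_price    = 1.00    # exercise price per share, in dollars
fair_value      = 10.00   # current fair market value per share
tax_rate        = 0.35    # placeholder; varies by jurisdiction and option type

# a 25% cliff plus monthly vesting works out to a linear fraction at whole years
vested        = options_granted * min(years_worked / vesting_years, 1.0)
exercise_cost = vested * strike_price
paper_gain    = vested * (fair_value - strike_price)
tax_bill      = paper_gain * tax_rate

print(f"vested options:   {vested:,.0f}")          # 375,000
print(f"cost to exercise: ${exercise_cost:,.0f}")  # $375,000
print(f"paper gain:       ${paper_gain:,.0f}")     # $3,375,000
print(f"possible taxes:   ${tax_bill:,.0f}")       # $1,181,250 at the placeholder rate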

my actual situation

I don’t want to get too far into this fake situation because when people tell me fake situations, I’m like “ok but that’s not real why should I care.” Here’s something real.

I do not own 0.5% of a billion dollar company. In fact I own 0%. But the company I work for is valued at more than a billion dollars, and I do have options to buy some of it. The options I’m granted each year would cost, very roughly, $100,000 (including exercise prices + taxes). Over 4 years, that’s almost half a million dollars. My after-tax salary is less than $100,000 USD/year, so by definition it is impossible for me to exercise my options without borrowing money.

The total amount it would cost to exercise + pay taxes on my options is more than all of the money I have. I imagine that’s the case for some of my colleagues as well (for many of them, this is their first job out of school). If I leave, the options expire after 3 months. I still do not understand the tax implications of exercising at all. (it makes me want to hide under my bed and never come out)

I was really surprised by all of this. I’d never made a financial decision much bigger than buying a $1000 plane ticket or signing a lease before. So the prospect of investing a hundred thousand dollars in some stock? Having to pay taxes on money that I do not actually have? super scary.

So the possibilities, if I want to ever quit my job, are:

  1. exercise them somehow (with money I get from ??? somewhere ???).
  2. give up the options
  3. find a way to sell the options or the resulting stock

There are several variations on #3. They mostly involve cooperation from your employer – it’s possible that they’ll let you sell some options, under some conditions, if you’re lucky / if they like you / if the stars are correctly aligned. This post How to sell secondary stock says a little more (thanks @antifuchs!). This HN comment describes a situation where someone got an offer from an outside investor, and the investor was told by the company to not buy from him (and then didn’t buy from him). Your employer has all the power.

Again, this isn’t a disaster – I have a good job, which pays me a SF salary despite me living in Montreal. It’s a fantastic situation to be in. And certainly having an option to buy stock is better than having nothing at all! But you can ask questions, and I like being informed.

Questions to ask

Stock options are very complicated. If you start out knowing nothing, and you have an offer to evaluate this week, you’re unlikely to be able to understand every possible scenario. But you can do better than me!

When I got an offer, they were super willing to answer questions, and I didn’t know what to ask. So here are some things you could ask. In all this I’m going to assume you work for a US company.

Basic questions:

  • how many stock options (# shares)
  • vesting schedule (usually 4 years / 1 year “cliff”)
  • how many outstanding shares
  • company’s current valuation
  • exercise price (per share)
  • fair market value (per share: a made-up number, but possibly useful)
  • if they’re offering ISOs, NSOs, or RSUs
  • how long after leaving do you have to exercise?

Then you can do some basic math and figure out how much it would cost to exercise the options, if you choose to. (I have a friend who paid $1 total to exercise his stock options. It might be cheap!)

More ambitious questions

As with all difficult questions, before you accept an offer is the best time to ask, because it’s when you have the most leverage.

  • will they let you sell stock to an outside investor?
  • If you can only exercise for 3 months after leaving, is that negotiable? (pinterest gives you the option of 7 years and worse tax implications. can they do the same?)
  • If the company got sold for the current valuation (2X? 10X?) in 2 years, what would my shares be worth? What if the company raises a lot of money between now and then?
  • Can they give you a summary of what stock & options other people have? This is called the “cap table”. (The reason you might want to know this: often VCs are promised that they’ll get their money first in the case of any liquidation event. Before you! Sometimes they’re promised at least a 3x return on their investment. This is called a “liquidation preference” .)
  • Do the VCs have participation? (there’s a definition of participation and other stock option terms here)
  • Can you early exercise your options? I know someone who early exercised and saved a ton of money on taxes by doing it. This guide talks more about early exercising.
  • Do your options vest faster if the company is acquired? What if you get terminated? (these possibilities are called “single/double trigger”)

If you have more ideas for good questions, tell me! I’ll add them to this list.

#talkpay

I think it’s important to talk about stock option grants! A lot of money can be at stake, and it’s difficult to talk about amounts in the tens or hundreds of thousands.

There’s also some tension about this topic because people get very emotionally invested in startups (for good reason!) and often feel guilt about leaving / discussing the financial implications of leaving. It can feel disloyal!

But if you’re trying to make an investment decision about thousands of dollars, I think you should be informed. Being informed isn’t disloyal :) The company you work for is informed.

Do the math

The company making you an offer has lawyers and they should know the answers to all the questions I suggested. They’ve thought very carefully about these things already.

I wish I’d known what questions to ask and done some of the math before I started my job, so I knew what I was getting into. Ask questions for me! :) You’ll understand more clearly what investment decisions might be ahead of you, and what the financial implications of those decisions might be.

Thanks to Leah Hanson and Dan Luu for editing help!

Most commonly used statistical tests and implementation in R


This chapter explains the purpose of some of the most commonly used statistical tests and how to implement them in R.

1. One Sample t-Test

Why is it used?

It is a parametric test used to test if the mean of a sample from a normal distribution could reasonably be a specific value.

set.seed(100)
x <- rnorm(50, mean = 10, sd = 0.5)
t.test(x, mu = 10)  # test if the mean of x could be 10
#=> One Sample t-test
#=>
#=> data:  x
#=> t = 0.70372, df = 49, p-value = 0.4849
#=> alternative hypothesis: true mean is not equal to 10
#=> 95 percent confidence interval:
#=>   9.924374 10.157135
#=> sample estimates:
#=> mean of x
#=>  10.04075

How to interpret?

In the above case, the p-Value is not less than the significance level of 0.05, so the null hypothesis that the mean is 10 cannot be rejected. Also note that the 95% confidence interval includes the value 10 within its range. So it is reasonable to say the mean of ‘x’ is 10, especially since ‘x’ is assumed to be normally distributed. If a normal distribution cannot be assumed, use the Wilcoxon signed rank test shown in the next section.

Note: Use conf.level argument to adjust the confidence level.

2. Wilcoxon Signed Rank Test

Why / When is it used?

To test the location (median) of a sample when a normal distribution is not assumed. The Wilcoxon signed rank test is an alternative to the t-Test when the data sample cannot be assumed to follow a normal distribution. It is a non-parametric method used to test if an estimate differs from its true value.

numeric_vector <- c(20, 29, 24, 19, 20, 22, 28, 23, 19, 19)
wilcox.test(numeric_vector, mu = 20, conf.int = TRUE)
#>  Wilcoxon signed rank test with continuity correction
#>
#> data:  numeric_vector
#> V = 30, p-value = 0.1056
#> alternative hypothesis: true location is not equal to 20
#> 90 percent confidence interval:
#>  19.00006 25.99999
#> sample estimates:
#> (pseudo)median
#>       23.00002

How to interpret?

If p-Value < 0.05, reject the null hypothesis and accept the alternative mentioned in your R code’s output. Type example(wilcox.test) in the R console for more illustrations.

3. Two Sample t-Test and Wilcoxon Rank Sum Test

Both the t-Test and the Wilcoxon rank sum test can be used to compare the means of 2 samples. The difference is that the t-Test assumes the samples being tested are drawn from a normal distribution, while the Wilcoxon rank sum test does not.

How to implement in R?

Pass the two numeric vector samples to t.test() when the samples can be assumed to be normally distributed, and to wilcox.test() when they can’t.

x <- c(0.80, 0.83, 1.89, 1.04, 1.45, 1.38, 1.91, 1.64, 0.73, 1.46)
y <- c(1.15, 0.88, 0.90, 0.74, 1.21)
wilcox.test(x, y, alternative = "g")  # "g" for greater
#=> Wilcoxon rank sum test
#=>
#=> data:  x and y
#=> W = 35, p-value = 0.1272
#=> alternative hypothesis: true location shift is greater than 0

With a p-Value of 0.1272, we cannot reject the null hypothesis that x and y have the same mean.

t.test(1:10, y = c(7:20))  # p-value = 1.855e-05
#=> Welch Two Sample t-test
#=>
#=> data:  1:10 and c(7:20)
#=> t = -5.4349, df = 21.982, p-value = 1.855e-05
#=> alternative hypothesis: true difference in means is not equal to 0
#=> 95 percent confidence interval:
#=>   -11.052802  -4.947198
#=> sample estimates:
#=> mean of x mean of y
#=>       5.5      13.5

With p-Value < 0.05, we can safely reject the null hypothesis that there is no difference in mean.

What if we want to do a 1-to-1 comparison of means for values of x and y?

# Use paired = TRUE for a 1-to-1 comparison of observations.
t.test(x, y, paired = TRUE)       # when observations are paired, use the 'paired' argument
wilcox.test(x, y, paired = TRUE)  # both x and y are assumed to have similar shapes

When can I conclude that the means are different?

Conventionally, if the p-Value is less than the significance level (typically 0.05), reject the null hypothesis that both means are equal.

4. Shapiro Test

Why is it used?

To test if a sample follows a normal distribution.

shapiro.test(numericVector)  # Does numericVector follow a normal distribution?

Let’s see how to do the test on a sample from a normal distribution.

# Example: test a sample from a normal distribution
set.seed(100)
normaly_disb <- rnorm(100, mean = 5, sd = 1)  # generate a normally distributed sample
shapiro.test(normaly_disb)  # the Shapiro test
#=> Shapiro-Wilk normality test
#=>
#=> data:  normaly_disb
#=> W = 0.98836, p-value = 0.535

How to interpret?

The null hypothesis here is that the sample being tested is normally distributed. Since the p-Value is not less than the significance level of 0.05, we don’t reject the null hypothesis. The tested sample is therefore consistent with a normal distribution (though we already knew that!).

# Example: test a sample from a uniform distribution
set.seed(100)
not_normaly_disb <- runif(100)  # uniform distribution
shapiro.test(not_normaly_disb)
#=> Shapiro-Wilk normality test
#=>
#=> data:  not_normaly_disb
#=> W = 0.96509, p-value = 0.009436

How to interpret?

If p-Value is less than the significance level of 0.05, the null-hypothesis that it is normally distributed can be rejected, which is the case here.

5. Kolmogorov-Smirnov Test

Kolmogorov-Smirnov test is used to check whether 2 samples follow the same distribution.

ks.test(x, y)  # x and y are two numeric vectors

# Case 1: samples drawn from different distributions
x <- rnorm(50)
y <- runif(50)
ks.test(x, y)
#=> Two-sample Kolmogorov-Smirnov test
#=>
#=> data:  x and y
#=> D = 0.58, p-value = 4.048e-08
#=> alternative hypothesis: two-sided

# Case 2: both samples drawn from a normal distribution
x <- rnorm(50)
y <- rnorm(50)
ks.test(x, y)
#=> Two-sample Kolmogorov-Smirnov test
#=>
#=> data:  x and y
#=> D = 0.18, p-value = 0.3959
#=> alternative hypothesis: two-sided

How to tell if they are from the same distribution?

If p-Value < 0.05 (the significance level), we reject the null hypothesis that they are drawn from the same distribution. In other words, p < 0.05 suggests that x and y come from different distributions.

6. Fisher’s F-Test

Fisher’s F test can be used to check if two samples have the same variance.

var.test(x, y)  # Do x and y have the same variance?

Alternatively fligner.test() and bartlett.test() can be used for the same purpose.

7. Chi Squared Test

Chi-squared test in R can be used to test if two categorical variables are dependent, by means of a contingency table.

Example use case: you may want to figure out if big-budget films become box-office hits. We have two categorical variables (film budget, success status), each with 2 levels (big/low budget and hit/flop), which forms a 2 x 2 contingency table.

chisq.test(table(categorical_X, categorical_Y), correct = FALSE)  # Yates continuity correction not applied
# or
summary(table(categorical_X, categorical_Y))  # also performs a chi-squared test

# Sample results
#=> Pearson's Chi-squared test
#=> data:  M
#=> X-squared = 30.0701, df = 2, p-value = 2.954e-07

How to tell if x, y are independent?

There are two ways to tell if they are independent:

  1. By looking at the p-Value: if the p-Value is less than 0.05, we reject the null hypothesis that x and y are independent. So for the example output above (p-Value = 2.954e-07), we reject the null hypothesis and conclude that x and y are not independent.

  2. From the Chi-squared value: a 2 x 2 contingency table has 1 degree of freedom (d.o.f.), so if the calculated Chi-squared statistic is greater than 3.841 (the critical value at the 0.05 level), we reject the null hypothesis that the variables are independent. For larger contingency tables, the degrees of freedom are (rows - 1) * (columns - 1), and the critical value is given by qchisq(0.95, df).

8. Correlation

Why is it used?

To test the linear relationship of two continuous variables

The cor.test() function computes the correlation between two continuous variables and tests whether they are linearly related. The null hypothesis is that the true correlation between x and y is zero.

cor.test(x, y)  # where x and y are numeric vectors

cor.test(cars$speed, cars$dist)
#=> Pearson's product-moment correlation
#=>
#=> data:  cars$speed and cars$dist
#=> t = 9.464, df = 48, p-value = 1.49e-12
#=> alternative hypothesis: true correlation is not equal to 0
#=> 95 percent confidence interval:
#=>   0.6816422 0.8862036
#=> sample estimates:
#=>       cor
#=> 0.8068949

How to interpret?

If the p-Value is less than 0.05, we reject the null hypothesis that the true correlation is zero (i.e. that the variables are linearly unrelated). So in this case, we reject the null hypothesis and conclude that dist is linearly related to speed.

9. More Commonly Used Tests

fisher.test(contingencyMatrix, alternative = "greater")  # Fisher's exact test of independence of rows and columns in a contingency table
friedman.test()  # Friedman's rank sum non-parametric test

There are more useful tests available in various other packages.

The package lawstat has a good collection. The outliers package has a number of tests for detecting outliers.

Introducing Guesstimate, a Spreadsheet for Things That Aren’t Certain


Existing spreadsheet software is made for analyzing tables of data. Excel, Google Sheets, and similar tools are fantastic for doing statistics on things that are well known.

Unfortunately, many important things are not known. I don’t yet know if I will succeed as an entrepreneur, when I will die, or exactly how bad sugar is for me. No one really knows what the US GDP will be if Donald Trump gets elected, or if the US can ‘win’ if we step up our fight in Syria. But we can make estimates, and we can use tools to become as accurate as possible.

Estimates for these things should feature ranges, not exact numbers. There should be lower and upper bounds.

The first reaction of many people to uncertain math is to use the same techniques as for certain math. They either collapse each unknown into a single point estimate, or work out ‘worst case’ and ‘best case’ scenarios and multiply each through separately. Both approaches are quite incorrect and produce oversimplified outputs.

This is why I’ve made Guesstimate, a spreadsheet that’s as easy to use as existing spreadsheets, but works for uncertain values. For any cell you can enter confidence intervals (lower and upper bounds) that can represent full probability distributions. 5000 Monte Carlo simulations are performed to find the output interval for each equation, all in the browser.
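To give a flavor of what that means under the hood, here is a minimal sketch of Monte Carlo interval arithmetic in Python. This is my own illustration of the technique, not Guesstimate’s actual code (which runs in JavaScript in the browser): each uncertain cell is sampled many times, the formula is evaluated per sample, and the output interval is read off the resulting distribution.

# Minimal sketch of Monte Carlo interval arithmetic (illustration only).
# Treat a 90% confidence interval as a normal distribution and
# propagate it through a formula by sampling.
import random

def sample_interval(low, high, n=5000):
    # Interpret [low, high] as the 90% CI of a normal distribution:
    # mean at the midpoint, 90% of the mass between the bounds.
    mean  = (low + high) / 2
    sigma = (high - low) / (2 * 1.645)  # a 90% CI spans about +/- 1.645 sigma
    return [random.gauss(mean, sigma) for _ in range(n)]

# Example: revenue = customers * price_per_customer
customers = sample_interval(500, 1500)
price     = sample_interval(20, 40)
revenue   = sorted(c * p for c, p in zip(customers, price))

n = len(revenue)
print("median:", revenue[n // 2])
print("90% interval:", revenue[int(0.05 * n)], "to", revenue[int(0.95 * n)])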

A simple example. Link: http://getguesstimate.com/models/193

At the end of this you don’t just understand the ‘best’ and ‘worst’ scenarios, but you also get everything in between and outside. There’s the mean, the median, and several of the percentiles. In the future I intend to add sensitivity analyses and the value of information calculations.

Guesstimate is free and open source. I encourage you to try it out. Make estimates of things you find important or interesting. I’m looking forward to seeing what people come up with.

Haskell Game Server – Part 2


Posted on January 1, 2016

Previous Posts

Topics

In today’s post I will cover which messages we shuffle between the server and clients and their purpose, how the game world is fundamentally managed, and how we use an Actor class to help manage scene objects.

Messages Between Server & Clients

We’ve built our game so that the server only sends data the player AI has explicitly requested, with a few minor exceptions such as game state changes. The goal is to let the player drive as much of their robot’s gameplay as possible through their AI’s decisions, including when and what to actually query.

There are three primary categories of messages: configure, query, and commit messages.

We use techniques such as message size checking and API rate limiting to prevent people from abusing the server.

Configure Messages

When the server is in the initial ConfigureState the player AI can configure their robot from a large selection of chassis, weapons, sensors, counter measures, reactors, engines, and more. We liked the idea of giving players the ability to do this configuration in real time, especially if we add another reconnaissance phase giving them varying details about their enemy configurations, and the ability to save and load data about past matches should they desire to do so.

A configuration message may look something like this, where you simply define the documented model of a component to use:


data SlugConfigureWeaponMessage
  = SlugConfigureWeaponMessage {
      weaponModel    :: Required 1 (Value String)
    , capacitorModel :: Optional 2 (Value String)
    , ammoModel      :: Required 3 (Value String)
    }
  deriving (Generic, Show)

The server then processes these configuration messages and makes sure that the correct amount of overall weight is available to add these components, and that you can actually plug a component into its receiving component. Internally we call this a component protocol which defines the receiving sockets and incoming plugs it can handle. For example, a specific arm model may only receive a total of two weapons, one of which is ballistic, the other a beam weapon, with no limitations. Another arm model may only accept one missile weapon from a specific list of manufacturers. This gives us a lot of power to mix, match, and limit which components can plug into what.

The very last action taken is a SlugConfigureDoneRequest message, which simply signals that the AI is finished with its configuration. If for any reason an AI does not configure itself before the end of the phase, it is automatically assigned a default robot. A player can also indicate in the done request that they want to use the default robot.

Query Messages

Once the server is in the GameState the AI can then begin querying information about their robot. Right now the primary query about one’s robot is a large monolithic protobuf message.


data SlugGetQueryWarMachineResponse
  = SlugGetQueryWarMachineResponse {
      state          :: Required 1 (Value String)
    , maxWeight      :: Required 2 (Value Int32)
    , currentWeight  :: Required 3 (Value Int32)
    , hasStorage     :: Required 4 (Value Bool)
    , storage        :: Optional 5 (Message SlugQueryStorageMessage)
    , hasCapacitor   :: Required 6 (Value Bool)
    , capacitor      :: Optional 7 (Message SlugQueryCapacitorMessage)
    , armCount       :: Required 8 (Value Int32)
    , legCount       :: Required 9 (Value Int32)
    , position       :: Required 10 (Message VectorMessage)
    , rotation       :: Required 11 (Message VectorMessage)
    , rotationRate   :: Required 12 (Value Float)
    , rotationTarget :: Required 13 (Value Float)
    , reactor        :: Required 14 (Message SlugQueryReactorMessage)
    , torso          :: Required 15 (Message SlugQueryTorsoMessage)
    , cockpit        :: Required 16 (Message SlugQueryCockpitMessage)
    , arms           :: Repeated 17 (Message SlugQueryArmMessage)
    , legs           :: Repeated 18 (Message SlugQueryLegMessage)
    }
  deriving (Generic, Show)

Each Message gives further details about the additional components inside the robot, such as data acquired by sensors, the firing and ammunition state of a weapon, etc. The player AI can then react to these values by doing things such as moving, rotating, firing weapons, communicating with teammates, and much more.

Another slightly smaller message defines data about enemy robots which were scanned by the sensors, with varying levels of detail based on how powerful the robot’s equipped computer is.

As you can see we call our robots/mechs War Machines.

Commit Messages

Interacting with the actual war machine is done with very simple messages such as the following:


data SlugSetCommitEngineRequest
  = SlugSetCommitEngineRequest {
      state        :: Optional 1 (Value Int32)
    , acceleration :: Optional 2 (Value Float)
    }
  deriving (Generic, Show)

data SlugSetCommitArmWeaponRequest
  = SlugSetCommitArmWeaponRequest {
      weaponPosition :: Required 1 (Value Int32)
    , armPosition    :: Required 2 (Value Int32)
    , state          :: Optional 3 (Value Int32)
    , fireState      :: Optional 4 (Value Int32)
    }
  deriving (Generic, Show)

These allow the AI to change the state of a component (maybe they wish to power it down to utilize the additional reactor power for another component), and perform specific actions with the component. In the above example setting the acceleration of the engine will begin moving the war machine. Setting the fire state to Reloading will force the weapon to do a reload.

Game World Management

This is one of the more interesting parts of the game server where we break away from the impure IO world and hand everything off into the pure simulation world. The World state is stored inside a TVar so we can make use of Haskell’s brilliant STM features to handle thread safety for us.

We store all relevant data using IntMaps, although I’m not sure if there is a better way to store game objects of various types. Currently it seems the best method is to just make a map for each data type, of which we only have a few.


type PlayerMap  = IntMap.IntMap Player
type ChassisMap = IntMap.IntMap Chassis
type AmmoMap    = IntMap.IntMap Ammo

data World
  = World {
      _worldState          :: !WorldState     -- track the phases the world is in
    , _worldCounter        :: !ObjectId       -- a simple incremented counter for object ids
    , _worldPlayers        :: !PlayerMap      -- track player api call rates and the like
    , _worldChassis        :: !ChassisMap     -- war machines in the world
    , _worldAmmo           :: !AmmoMap        -- projectiles currently in the world
    , _worldConfigurations :: !Configurations -- yaml loaded war machine data for configuration phase
    , _worldDefaultChassis :: !Chassis        -- the default chassis
    }

makeLenses ''World

The game loop looks like this:


{-# LANGUAGE BangPatterns #-}

runWorld :: Server -> GameTime -> UTCTime -> IO ()
runWorld server !time before = do
  now <- getCurrentTime
  let dt = diffTime now before
  world0 <- readTVarIO (server^.serverWorld)
  when (world0^.worldState == GamePhase) $ do
    sim0 <- atomically $ iterateWorld server time dt
    mapM_ (broadcastSimulationMessages server) (sim0^.simulationMessages)
  threadDelay loopWait
  let next = time + dt
  runWorld server next now
  where
    diffTime = (realToFrac .) . diffUTCTime

I use BangPatterns because I want to force the next time value to be evaluated each tick. While profiling for an unrelated space leak, it turned out let next = time + dt was causing a small heap space leak, because there are cases where the time value never gets evaluated downstream for a while.

The main loop itself runs at about 60 updates per second. This is good enough to smoothly move the simulation state forward and notify any listening external clients of the action going on.


iterateWorld :: Server -> GameTime -> DeltaTime -> STM Simulation
iterateWorld server time dt = do
  world0 <- readTVar (server^.serverWorld)
  let sim0  = Simulation [] [] [] NoChassisState
  let (world1, sim1) = runState (stepWorld time dt world0) sim0
  writeTVar (server^.serverWorld) world1
  return sim1

One of the more interesting problems we faced was how to take simulation state updates generated deep inside game objects and get them to percolate back up to IO so we can send them out to external clients. We handled this with the State monad: we thread a Simulation value through each run* function, which collects all outgoing movement and rotation messages and notifies the World when objects have spawned (such as a weapon firing a projectile).

The World iteration function is basically this:


stepWorld :: GameTime -> DeltaTime -> World -> MTL.State Simulation World
stepWorld t dt world0 = do
  sim0 <- get
  -- a lot of processing steps happen here
  put sim1
  return world1

Where each component inside the world that gets iterated over is also passed the Simulation state so it can append any new messages to be acted upon.

Actor Class

The final topic of this post is about our Actor class. This is quite useful for providing a common interface which game objects must implement so that they can move, rotate, or collide.

The code is so incredibly dense I’ll post the entire thing here.


module Game.Actor where

import           Linear
import           Control.Lens

import           Game.Types
import           Game.Utils

class Actor a where
  getId         :: a -> ObjectId
  setId         :: ObjectId -> a -> a
  getObjectType :: a -> ObjectType
  setObjectType :: ObjectType -> a -> a
  getPosition   :: a -> Position
  setPosition   :: Position -> a -> a

class (Actor a) => Movable a where
  getAcceleration :: a -> Acceleration
  setAcceleration :: Acceleration -> a -> a
  getVelocity     :: a -> Velocity
  setVelocity     :: Velocity -> a -> a
  getVelocityMax  :: a -> Velocity

class (Actor a) => Rotatable a where
  getRotation       :: a -> Rotation
  setRotation       :: Rotation -> a -> a
  getRotationRate   :: a -> Float
  setRotationRate   :: Float -> a -> a
  getRotationTarget :: a -> Float
  setRotationTarget :: Float -> a -> a

class (Actor a) => Collideable a where
  getCollider :: a -> Collider
  setCollider :: Collider -> a -> a

moveActor :: (Movable a, Rotatable a) => DeltaTime -> a -> a
moveActor dt m =
  setVelocity vel . setPosition pos $ m
  where
    -- velocity
    vel = if getVelocity m < getVelocityMax m
          then getVelocity m + getAcceleration m * dt
          else getVelocity m
    -- position
    dir = rotate (getRotation m) (newPosition 0.0 0.0 1.0) -- forward on Z vector
    pos = getPosition m + (vel * dt *^ dir)

rotateActor :: (Rotatable a) => DeltaTime -> a -> a
rotateActor dt r =
  setRotation rot r
  where
    mid = axisAngle (newPosition 0.0 1.0 0.0) (toRadians $ getRotationTarget r)
    rot = slerp (getRotation r) (getRotation r * mid) (dt * getRotationRate r)

isColliding :: (Collideable a, Collideable b) => a -> b -> Bool
isColliding aA aB = do
  let (cAr, cAh) = getColliderValues (getCollider aA)  -- radius and height values of each cylinder
  let (cBr, cBh) = getColliderValues (getCollider aB)
  let cPa = getPosition aA
  let cPb = getPosition aB
  -- colliding when the horizontal (XZ) distance is within the combined radii
  -- and the vertical (Y) distance is within the combined height values
  abs (distance (cPa^._xz) (cPb^._xz)) <= cAr + cBr && abs ((cPa^._y) - (cPb^._y)) <= cAh + cBh

Not all objects are created equal, so we provide stricter refinements of Actor as well. A building will clearly be neither Movable nor Rotatable, while a base turret may be Rotatable but not Movable.

Our isColliding function simply does a naive cylinder collision check. We then take that data and calculate the direction from which the impact occurred; in the case of a projectile impacting a war machine, we use that direction to calculate which component the damage applies to, and whether to reduce the forward- or rearward-facing armor values. In the future we’d like to replace this with actual model data so we can get pixel-perfect collisions; however, we think the naive approach will work fine until then.

Each game tick, a Movable/Rotatable object in the scene is run with moveActor dt . rotateActor dt $ actor, and if anything changed the correct Simulation message is generated and passed up to the main loop. We also check each object against every other object in the scene to see if they collided, and pass up the relevant messages here as well.

Conclusion

We were able to split our game server code up so that all IO related tasks happen in Server while all the simulation logic occurs in the pure World. This makes it incredibly easy to reason about how all of our simulation logic runs and test and debug any logic issues we come across. We also use various types of protobuf messages which allow the player’s AI to interact with our server and control their war machines. Finally, we’ve created an Actor class which simulation objects must implement so that they can interact inside the game world. The logic for performing movement, rotations, and collision detection was incredibly easy, with many thanks to Edward A. Kmett and his powerful linear library.

I don’t have any plans for another post at this time, however, if there’s anything you’d like to hear more detail about please feel free to reach out to me over twitter.

} // good to go

Okay, let's see what we've got. Two sets of annotated training materials. Six books. Over four dozen online videos. Some 80 articles, interviews, and academic papers. A slew of blog entries, and more posts to Usenet and StackOverflow than you can shake a stick at. A couple of contributions to the C++ vernacular. A poll equating my hair with that of a cartoon character.

I think that's enough; we're good to go. So consider me gone. 25 years after publication of my first academic papers involving C++, I'm retiring from active involvement with the language.

It's a good time for it. My job is explaining C++ and how to use it, but the C++ explanation biz is bustling. The conference scene is richer and more accessible than ever before, user group meetings take place worldwide, the C++ blogosphere grows increasingly populous, technical videos cover everything from atomics to zero initialization, audio podcasts turn commute-time into learn-time, and livecoding makes it possible to approach C++ as a spectator sport. StackOverflow provides quick, detailed answers to programming questions, and the C++ Core Guidelines aim to codify best practices. My voice is dropping out, but a great chorus will continue.

Anyway, I'm only mostly retiring from C++. I'll continue to address errata in my books, and I'll remain consulting editor for the Effective Software Development Series. I may even give one more talk. (A potential conference appearance has been in the works for a while. If it gets scheduled, I'll let you know.)

"What's next?," you may wonder. I get that a lot. I've spent the last quarter century focusing almost exclusively on C++, and that's caused me to push a lot of other things to the sidelines. Those things now get a chance to get off the bench. 25 years of deferred activities begets a pretty long to-do list. The topmost entry? Stop trying to monitor everything in the world of C++ :-)

Scott

The Website Obesity Crisis


This is the text version of a talk I gave on October 29, 2015, at the Web Directions conference in Sydney. [53 minute video].

  1. The Crisis
  2. Fake Fixes
  3. Fat Ads
  4. Fat Assets
  5. Chickenshit Minimalism
  6. Interface Sprawl
  7. Heavy Clouds
  8. Stirring Conclusion

Let me start by saying that beautiful websites come in all sizes and page weights. I love big websites packed with images. I love high-resolution video. I love sprawling Javascript experiments or well-designed web apps.

This talk isn't about any of those. It's about mostly-text sites that, for unfathomable reasons, are growing bigger with every passing year.

While I'll be using examples to keep the talk from getting too abstract, I’m not here to shame anyone, except some companies (Medium) that should know better and are intentionally breaking the web.


What do I mean by a website obesity crisis?

Here’s an article on GigaOm from 2012 titled "The Growing Epidemic of Page Bloat". It warns that the average web page is over a megabyte in size.

The article itself is 1.8 megabytes long.

Here's an almost identical article from the same website two years later, called “The Overweight Web". This article warns that average page size is approaching 2 megabytes.

That article is 3 megabytes long.

If present trends continue, there is the real chance that articles warning about page bloat could exceed 5 megabytes in size by 2020.

The problem with picking any particular size as a threshold is that it encourages us to define deviancy down. Today’s egregiously bloated site becomes tomorrow’s typical page, and next year’s elegantly slim design.

I would like to anchor the discussion in something more timeless.

To repeat a suggestion I made on Twitter, I contend that text-based websites should not exceed in size the major works of Russian literature.

This is a generous yardstick. I could have picked French literature, full of slim little books, but I intentionally went with Russian novels and their reputation for ponderousness.

In Goncharov's Oblomov, for example, the title character spends the first hundred pages just getting out of bed.

If you open that tweet in a browser, you'll see the page is 900 KB big.

That's almost 100 KB more than the full text of The Master and Margarita, Bulgakov’s funny and enigmatic novel about the Devil visiting Moscow with his retinue (complete with a giant cat!) during the Great Purge of 1937, intercut with an odd vision of the life of Pontius Pilate, Jesus Christ, and the devoted but unreliable apostle Matthew.

For a single tweet.

Or consider this 400-word-long Medium article on bloat, which includes the sentence:

"Teams that don’t understand who they’re building for, and why, are prone to make bloated products."

The Medium team has somehow made this nugget of thought require 1.2 megabytes.

That's longer than Crime and Punishment, Dostoyevsky’s psychological thriller about an impoverished student who fills his head with thoughts of Napoleon and talks himself into murdering an elderly money lender.

Racked by guilt, so rattled by his crime that he even forgets to grab the money, Raskolnikov finds himself pursued in a cat-and-mouse game by a clever prosecutor and finds redemption in the unlikely love of a saintly prostitute.

Dostoyevsky wrote this all by hand, by candlelight, with a goddamned feather.

Here's a recent article called .

Rehearsing the usual reasons why bloat is bad, it includes the sentence “heavy pages tend to be slow pages, and slow pages mean unhappy users.”

That sentence might put you in mind of the famous opening line to Anna Karenina:

“All happy families are alike; every unhappy family is unhappy in its own way.”

But the not-so-brief history of bloat is much longer than Anna Karenina.

In fact, it's longer than War and Peace, Tolstoi’s exploration of whether individual men and women can be said to determine the great events of history, or whether we are simply swept along by an irresistible current of historical inevitability.

Here's an article from the Yorkshire Evening Post, typical of thousands of local news sites. It does not explore the relationship between history and individual will at all:

"Leeds Hospital Bosses Apologise After Curry and Crumble On The Same Plate".

This poignant story of two foods touching on a hospital plate could almost have been written by Marcel Proust, for whom the act of dipping a morsel of cake in a cup of tea was the starting point for an expanding spiral of vivid recollections, culminating in the realization, nine volumes and 3 megabytes of handwritten prose later, that time and memory themselves are only an illusion.

The javascript alone in "Leeds Hospital Bosses Apologise after Curry and Crumble On The Same Plate" is longer than Remembrance of Things Past.

I could go on in this vein. And I will, because it's fun!

Here is an instructional article on Best Practices for Increasing Online performance that is 3.1 MB long.

The article mentions that Google was able to boost user engagement in Google Maps by reducing the page weight from 100KB to 80KB.

Remember when Google Maps, the most sophisticated web app of its day, was thirty-five times smaller than a modern news article?

Web obesity can strike in the most surprising places.

Tim Kadlec, for example, is an excellent writer on the topic of performance. His personal site is a model of parsimony. He is full of wisdom on the topic of reducing bloat.

But the slides from his recent talk on performance are only available as a 9 megabyte web page, or a 14 megabyte PDF.

Let me close with a lovely TechTimes article warning that Google is going to start labeling huge pages with a special ‘slow’ mark in its mobile search interface.

The article somehow contrives to be 18 megabytes long, including (in the page view I measured) a 3 megabyte video for K-Y jelly, an "intimate lubricant".

It takes a lot of intimate lubricant to surf the unfiltered Web these days.

What the hell is up?


Everyone admits there’s a problem. These pages are bad enough on a laptop (my fan spun for the entire three weeks I was preparing this talk), but they are hell on mobile devices. So publishers are taking action.

In May 2015, Facebook introduced ‘Instant Articles’, a special format for news stories designed to appear within the Facebook site, and to load nearly instantly.

Facebook made the announcement on a 6.8 megabyte webpage dominated by a giant headshot of some dude. He doesn’t even work for Facebook, he’s just the National Geographic photo editor.

Further down the page, you'll find a 41 megabyte video, the only way to find out more about the project. In the video, this editor rhapsodizes about exciting misfeatures of the new instant format like tilt-to-pan images, which means if you don't hold your phone steady, the photos will drift around like a Ken Burns documentary.

Facebook has also launched internet.org, an effort to expand Internet access. The stirring homepage includes stories of people from across the developing world, and what getting Internet access has meant for them.

You know what’s coming next. When I left the internet.org homepage open in Chrome over lunch, I came back to find it had transferred over a quarter gigabyte of data.

Surely, you'll say, there's no way the globe in the background of a page about providing universal web access could be a giant video file?

But I am here to tell you, oh yes it is. They load a huge movie just so the globe can spin.

This is Facebook's message to the world: "The internet is slow. Sit and spin."

And it's not like bad connectivity is a problem unique to the Third World! I've traveled enough here in Australia to know that in rural places in Tasmania and Queensland, vendors treat WiFi like hundred-year-old brandy.

You're welcome to buy as much of it as you want, but it costs a fortune and comes in tiny portions. And after the third or fourth purchase, people start to look at you funny.

Even in well-connected places like Sydney, we've all had the experience of having a poor connection, and almost no battery, while waiting for some huge production of a site to load so we can extract a morsel of information like a restaurant address.

The designers of pointless wank like that Facebook page deserve the ultimate penalty.

They should be forced to use the Apple hockey puck mouse for the remainder of their professional lives. [shouts of horror from the audience]

Google has rolled out a competitor to Instant Articles, which it calls Accelerated Mobile Pages. AMP is a special subset of HTML designed to be fast on mobile devices.

Why not just serve regular HTML without stuffing it full of useless crap? The question is left unanswered.

The AMP project is ostentatiously open source, and all kinds of publishers have signed on. Out of an abundance of love for the mobile web, Google has volunteered to run the infrastructure, especially the user tracking parts of it.

Jeremy Keith pointed out to me that the page describing AMP is technically infinite in size. If you open it in Chrome, it will keep downloading the same 3.4 megabyte carousel video forever.

If you open it in Safari, where the carousel is broken, the page still manages to fill 4 megabytes.

These comically huge homepages for projects designed to make the web faster are the equivalent of watching a fitness video where the presenter is just standing there, eating pizza and cookies.

The world's greatest tech companies can't even make these tiny text sites, describing their flagship projects to reduce page bloat, lightweight and fast on mobile.

I can't think of a more complete admission of defeat.

The tech lead for Google's AMP project was nice enough to engage us on Twitter. He acknowledged the bloat, but explained that Google was "resource constrained" and had had to outsource this project.

This admission moved me deeply, because I had no idea Google was in a tight spot. So I spent a couple of hours of my own time making a static version of the AMP website.

I began by replacing the image carousels with pictures of William Howard Taft, America's greatest president by volume.

I think this made a marked improvement from the gratuitous animations on the original page.

By cutting out cruft, I was able to get the page weight down to half a megabyte in one afternoon of work. This is eight times smaller than the original page.

I offered my changes to Google free of charge, but they are evidently too resource constrained to even find the time to copy it over.

This project led me to propose the Taft Test:

Does your page design improve when you replace every image with William Howard Taft?

If so, then, maybe all those images aren’t adding a lot to your article. At the very least, leave Taft there! You just admitted it looks better.

I want to share with you my simple two-step secret to improving the performance of any website.

  1. Make sure that the most important elements of the page download and render first.

  2. Stop there.

You don't need all that other crap. Have courage in your minimalism.

To channel a famous motivational speaker, I could go out there tonight, with the materials you’ve got, and rewrite the sites I showed you at the start of this talk to make them load in under a second. In two hours.

Can you? Can you?

Of course you can! It’s not hard! We knew how to make small websites in 2002. It’s not like the secret has been lost to history, like Greek fire or Damascus steel.

But we face pressure to make these sites bloated.

I bet if you went to a client and presented a 200 kilobyte site template, you’d be fired. Even if it looked great and somehow included all the tracking and ads and social media crap they insisted on putting in. It’s just so far out of the realm of the imaginable at this point.

If you've ever struggled to lose weight, you know there are tricks people use to fool themselves into thinking they're thinner. You suck in your gut, wear a tight shirt, stand on a certain part of the scale.

The same situation obtains with performance testing. People have invented creative metrics to persuade themselves that their molasses-like websites load fast.

Google has a popular one called SpeedIndex. (You know it's from Google because they casually throw an integral sign into the definition.)
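For the record, that definition, as I recall it from the WebPageTest documentation, is:

SpeedIndex = ∫ (1 − VC(t)/100) dt, integrated from t = 0 to the end of loading

where VC(t) is the percentage of the viewport that is visually complete at time t.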

SpeedIndex is based on the idea that what counts is how fast the visible part of the website renders. It doesn't matter what's happening elsewhere on the page. It doesn't matter if the network is saturated and your phone is hot to the touch. It doesn't matter if the battery is visibly draining. Everything is OK as long as the part of the site in the viewport appears to pop into view right away.

Of course, it doesn’t matter how fast the site appears to load if the first thing the completed page does is serve an interstitial ad. Or, if like many mobile users, you start scrolling immediately and catch the 'unoptimized' part of the page with its pants down.

There is only one honest measure of web performance: the time from when you click a link to when you've finished skipping the last ad.

Everything else is bullshit.

In conversations with web performance advocates, I sometimes feel like a hippie talking to SUV owners about fuel economy.

They have all kinds of weirdly specific tricks to improve mileage. Deflate the front left tire a little bit. Put a magnet on the gas cap. Fold in the side mirrors.

Most of the talk about web performance is similarly technical, involving compression, asynchronous loading, sequencing assets, batching HTTP requests, pipelining, and minification.

All of it obscures a simpler solution.

If you're only going to the corner store, ride a bicycle.

If you're only displaying five sentences of text, use vanilla HTML. Hell, serve a textfile! Then you won't need compression hacks, integral signs, or elaborate Gantt charts of what assets load in what order.

Browsers are really, really good at rendering vanilla HTML.

We have the technology.
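To underline how little machinery this takes, here is a complete static file server using nothing but Python's standard library. It's a sketch of the principle, not a production recommendation.

    # Serve the current directory (an index.html, a textfile, whatever)
    # at http://localhost:8000. No framework, no build step, no polyfills.
    from http.server import HTTPServer, SimpleHTTPRequestHandler

    HTTPServer(("", 8000), SimpleHTTPRequestHandler).serve_forever()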

Nutritionists used to be big on this concept of a food pyramid. I think we need one for the web, to remind ourselves of what a healthy site should look like.

Here is what I recommend for a balanced website in 2015:

  • A solid base of text worth reading, formatted with a healthy dose of markup.

  • Some images, in moderation, to illustrate and punch up the visual design.

  • A dollop of CSS.

  • And then, very sparingly and only if you need it, JavaScript.

Instead, here is the web pyramid as we observe it in the wild:

Web designers! It's not all your fault.

You work your heart out to create a nice site, optimized for performance. You spend the design process trying to anticipate the user’s needs and line their path with rose petals.

Then, after all this work is done, your client makes you shit all over your hard work by adding tracking scripts and ads that you have no control over, whose origin and content will be decided at the moment the page loads in the user’s browser, and whose entire purpose is to break your design and distract the user from whatever they came to the site to do.

The user's experience of your site is dominated by hostile elements out of your control.

This is a screenshot from an NPR article discussing the rising use of ad blockers. The page is 12 megabytes in size in a stock web browser.

The same article with basic ad blocking turned on is one megabyte. It’s no model of parsimony, but still, what a difference a plugin makes.

If you look at what the unblocked version pulls in, it’s not just videos and banner ads, but file after file of javascript. Every beacon, tracker and sharing button has its own collection of scripts that it needs to fetch from a third-party server. Each request comes packed with cookies.

More cookies are the last thing your overweight website needs.

These scripts get served from God knows where and are the perfect vector for malware.

Advertisers will tell you it has to be this way, but in dealing with advertisers you must remember they are professional liars.

I don’t mean this to offend. I mean it as a job description. An advertiser's job is to convince you to do stuff you would not otherwise do. Their task in talking to web designers is to persuade them that the only way to show ads is by including mountains of third-party cruft and tracking.

The bloat, performance, and security awfulness, they argue, is the price readers pay for free content.

I've come across these diagrams of the "adtech ecosystem", which I love. They communicate the sordidness of advertising in the way simple numbers never could.

Here is a view of the adtech ecosystem in 2011, when there were 100 ‘adtech’ companies.

Here's how things stood in 2012, when there were 350 of them.

By 2014, we were blessed with 947.

And in 2015 we have 1876 of these things. They are all competing for the same little slice of your online spending.

This booming industry is very complex—I believe intentionally so.

When you're trying to understand a complex system, it can be helpful to zoom out and look at the overall flow of things.

For example, here's a German diagram showing the energy budget of the Earth.

All kinds of complicated things happen to sunlight when it shines on plants or water, but you can ignore them completely and just measure the total energy that comes in and out.

In the same spirit, let me sketch the way money flows into the advertising bubble.

In the beginning, you have the consumer. In a misguided attempt at cultural sensitivity, I have chosen to represent the consumer with a kangaroo.

Consumers give money to merchants in exchange for goods and services. Here the red arrow represents money flowing to the merchant, or as you say in Australia, “dollars”.

A portion of this money is diverted to pay for ads. Think of it as a little consumption tax on everything you buy.

This money bounces around in the world of advertising middlemen until it ultimately flows out somewhere into someone's pocket.

Right now it's ending up in the pockets of successful ad network operators like Facebook, Yahoo!, and Google.

You’ll notice that there’s more money flowing out of this system than into it.

There’s a limit to how much money is available to ad companies from just consumers. Think of how many ads you are shown in a given day, compared to the number of purchases you actually make.

So thank God for investors! Right now they are filling the gap by pouring funding into this white-hot market. Their hope is that they will pick one of the few companies that ends up a winner.

However, at some point the investors who are pouring money in will want to move to the right-hand side of this diagram. And they'll want to get back even more money than they invested.

When this happens, and I believe it is happening right now, something will have to give.

Either we start buying more stuff, or a much bigger portion of our purchases goes to pay for ads…

Or the bubble is going to burst.

As it bursts, the remaining ad startups will grow desperate. They will search for ways to distinguish themselves from the pack with innovative forms of surveillance.

We’ll see a wave of consolidation, mergers, aggressive new forms of tracking, and the complete destruction of what remains of online privacy.

This is why I've proposed we regulate the hell out of them now.

I think we need to ban third-party tracking and third-party ad targeting.

Ads would become dumb again, and be served from the website they appear on.

Accepted practice today is for ad space to be auctioned at page load time. The actual ads (along with all their javascript surveillance infrastructure) are pulled in by the browser after the content elements are in place.

In terms of user experience, this is like a salesman arriving at a party after it has already started, demanding that the music be turned off, and setting up their little Tupperware table stand to harass your guests. It ruins the vibe.

Imagine what server-side ad layout would mean for designers. You would actually know what your pages are going to look like. You could load assets in a sane order. You could avoid serving random malware.

Giant animations would no longer helicopter in at page load time, destroying your layout and making your users hate you.

In fact, let's be even bolder in our thinking. I'm not convinced that online publishing needs to be ad-supported at all.

People dismiss micropayments, ignoring the fact that we already have a de facto system of micropayments that is working well.

This chart from the New York Times shows how much money you spend per page load on an American cell phone network, based on the bandwidth used. For example, it costs thirty cents to load a page from Boston.com on a typical data plan.

This is nothing more than a micropayment to the telecommunications company. And I'm sure it's more revenue than Boston.com sees from the ad impressions on the page.
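The arithmetic is worth spelling out. The numbers below are my own rough assumptions (a 30 megabyte page, ten dollars per gigabyte of metered data), not figures from the chart, but they reproduce the same order of magnitude.

    # Back-of-envelope cost of one page load on a metered data plan.
    # Both inputs are assumptions, not measurements.
    page_mb = 30          # approximate page weight
    usd_per_gb = 10.0     # assumed metered data price
    print(f"${page_mb / 1024 * usd_per_gb:.2f} per load")   # ~$0.29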

We're in a stupid situation where ads make huge profits for data carriers and ad networks, at the expense of everyone else.

Advertisers will kick and scream at any attempt to make them go back to the dumb advertising model. But there's no evidence that dumb ads are any worse than smart ones.

For years and years, poorly targeted advertising brought in enough money to fund entire television studios, radio shows, and all kinds of popular entertainment. Dumb ads paid for the Batmobile.

It costs a lot less to pay for a couple freelance journalists and a web designer than it does to film a sitcom. So why is it unthinkable to force everyone back to a successful funding model that doesn't break privacy?

Of course, advertisers will tell us how much better TV in the old days could have been if they had been able to mount a camera on top of every set.

But we’ve heard enough out of them.

Dumb ads will mean less ad revenue, because a lot of online ad spending is fueled by extravagant promises around the possibilities of surveillance technology.

But the ad market is going to implode anyway when the current bubble bursts.

The only question for publishers is whether to get ahead of this and reap the benefits, or circle down the drain with everybody else.


Let’s talk about a different cause of web obesity.

Fat assets!

This has been a problem since forever, but as networks get faster, and publishing workflows get more complicated, it gets easier to accidentally post immense files to your website.

Examples!

Here’s a self-righteous blogger who likes to criticize others for having bloated websites. And yet there's a gratuitous 3 megabyte image at the top of his most recent post.

Presumably this was a simple case of forgetting to resize an image. Without loading it on a slow connection, it's hard to notice the mistake.

Making networks faster makes this problem worse.

Here's a recent photo of a traffic jam in China. There are 50 lanes of cars here. Adding a 51st lane is not going to make things any better.

Similarly, adding network capacity is not going to convince people to start putting less stuff on their website.

Consider this recent Verge article about botnets.

At the top of the article is a pointless 3 megabyte photograph of headphones. This page fails the Taft Test.

This is part of a regrettable trend, made possible by faster networks, of having ‘hero images’ whose only purpose is for people to have something to scroll past.

In this case there's no use blaming the author. Something in the publishing toolchain failed to minimize this enormous image.

But the larger problem is that fast networks encourage people to include this kind of visual filler.

As we rely more and more on compression tricks, minimization, caching, and server configuration, mistakes become harder to catch and potentially more expensive.

Here's another example, interesting for two reasons.

First, the original image quality is awful. The picture looks like it was taken with a potato because it's a screen capture from a TV show.

Nevertheless, the image is enormous. If you load this website in Safari, the image is several megabytes in size.

If you load it in Chrome, it’s 100 kilobytes, because Chrome supports an on-the-fly compression format that Safari doesn't.

With these complicated optimization pipelines, it’s hard to be sure you’re seeing the same thing as your audience.

As a bonus, if you scroll to the bottom of the page, you see that a tiny animated GIF in the part of the page layout designers call "chum" is over a megabyte in size.

It’s a useless piece of clickbait, but it contributes massively to the overall weight of the page.

No one has fatter assets than Apple. Their site is laughably bloated. I think it may be a marketing trick.

"These images load crazy slow on my crappy Android phone, I can’t wait to get one of those Apple devices!"

Let's take a look at the Apple page that explains iOS on the iPad Pro.

How big do you think this page is?

Would you believe that it's bigger than the entire memory capacity of the iconic iMac? (32 MB)

In fact, you could also fit the contents of the Space Shuttle Main Computer. Not just for one Shuttle, but the entire fleet (5 MB).

And you would still have room for a tricked out Macintosh SE... (5MB).

...and the collected works of Shakespeare... (5 MB)

With lots of room to spare. The page is 51 megabytes big.


These Apple sites exemplify what I call Chickenshit Minimalism. It's the prevailing design aesthetic of today's web.

I wrote an essay about this on Medium. Since this is a fifty minute talk, please indulge me while I read it to you in its entirety:

"Chickenshit Minimalism: the illusion of simplicity backed by megabytes of cruft."

I already talked about how bloated Medium articles are. That one-sentence essay is easily over a megabyte.

It's not just because of (pointless) javascript. There's also this big image in the page footer.

Because my article is so short, it's literally impossible to scroll down to see it, but with developer tools I can kind of make out what it is: some sort of spacesuit people with tablets and mobile phones.

It's 900 kilobytes in size.

Here’s another example of chickenshit minimalism: the homepage for Google’s contributor program.

This is a vast blue wasteland, 2 megabytes in size, that requires you to click three times in order to read three sentences.

The last sentence will tell you that the program is not available here in Australia.

Here’s the homepage for the Tatamagouche Brewing Company. The only thing on it is a delicious beer. All the navigation has been tucked away into a hamburger menu.

Tucking into hamburgers is not the way to fix your flabby interface.

Design companies love this invisible hamburger antipattern.

Here's the 3 megabyte homepage for a company called POLLEN. You can barely even see the hamburger up there.

For the ultimate example of the chickenshit aesthetic, take a look at The Verge’s review of the Apple Watch.

But please don't load this on your phones right now, or you're going to bring down the conference wifi.

The Verge review is a UI abomination that completely hijacks the scroll mechanic of your browser. As you try to scroll down, weird things happen.

Interface elements slide in from the left.

Interface elements slide in from the right.

Interface elements you haven't seen since middle school call you unexpectedly in the middle of the night.

Once in a great while, the page actually scrolls down.

And what mainly happens is the fan on your laptop spins for dear life.

I tried to capture a movie of myself scrolling through The Verge’s Apple Watch review, but failed. The graphics card on my late-model Apple laptop literally could not cope with the load.


Some kind of brain parasite infected designers back when the iPad came out, and they haven’t recovered. Everything now has to look like a touchscreen.

This is the UK version of Wired, another site that has declared war on the scroll event.

You can try to scroll down, but it will just obstinately move you to the right instead. Article titles show up as giant screen-eating tiles of cruft.

Another hallmark of iPad chic is these elegant infographics in unreadable skinny white font on a light background.

Book a flight on Virgin America and you'll encounter this column of giant buttons floating in a sea of red.

This interface may look clean on a phone, but on a large screen it's just terrifying.

The "Book" button on that screen takes you to a land of vast input fields.

Note the hallmark ecosystem of giant fonts, tiny fonts, and extremely pale fonts.

After you decide where to go, the site takes you to this calendar widget.

It has equally enormous buttons, but the only piece of information I'm interested in—the price of the flight on each day—appears in microscopic type under the date.

My gripe with this design aesthetic is the loss of information density. I'm an adult human being sitting at a large display, with a mouse and keyboard. I deserve better.

Not every interface should be designed for someone surfing the web from their toilet.

Here's what the PayPal site used to look like.

I never fell to my knees to thank God for giving me the gift of sight so that I might behold the beauty of the old PayPal interface.

But it got the job done.

Here's the PayPal website as it looks today.

The biggest element on the page is an icon chastising me that I haven't told PayPal what I look like. Next to that is a useless offer to 'download the app', and then an offer for a credit card.

I can no longer control the sort order, there are no filter tools, and you see there are far fewer entries visible without scrolling.

Here is a Google 'control panel' that lets you configure your 'ad preferences'. It is similarly toylike and visually bloated.

It's like we woke up one morning in 2008 to find that our Lego had all turned to Duplo. Sites that used to show useful data now look like cartoons.

Interface elements are big and chunky. Any hint of complexity has been pushed deep into some sub-hamburger.

Sites target novice users on touchscreens at everyone else's expense.

Another example of this interface bloat: the Docker homepage. It consists of faint text separated by enormous swathes of nothingness.

I shouldn't need sled dogs and pemmican to navigate your visual design.

Search pages are where the pain hits hardest. Giant lettering and fat buttons replace the one thing anyone needs to see—a list of search results.

Here's a design where there's room for only one result, again on a giant high-resolution monitor.

I hate to do it, but I have to call out responsive design.

Everyone recognizes that it's challenging to make a site that looks good at all screen sizes.

But the emphasis on screen size has obscured an important difference in how people interact with interface elements.

On a phone, people are poking at a small screen with the meat styluses hanging off their arms. In that scenario, it makes sense to have big buttons.

On a large screen, where you have acres of space and an exquisitely sensitive pointing device, the same interface is maddening.

There may be no way to split the difference. I feel like designers are just waiting for us all to stop using laptops.

This is a typical recipe site grappling with this UI problem. I don't want to pick on it, because it's trying very hard.

But notice how some elements are tiny, and some are huge. Half the page is in the idiom of touch interfaces, and the whole thing is hard to read.

Here is the Forbes homepage, as seen with the left hamburger menu expanded. It looks like a random chunk of memory that accidentally got rendered to the video card.

There are multiple icons for social sharing, up arrows, down arrows, a smorgasbord of fonts.

And sitting confidently atop it all is a big fat turd of a banner ad, with its own ideas about typography and layout.

This is no way to live. We're not animals!


Finally, I want to talk about our giant backends.

How can we expect our web interfaces to be slim when we're setting such a bad example on the server side?

I have a friend who bakes cookies for a living. Like a lot of home bakers, she started by using her own kitchen, running it at full capacity until everything was covered in flour and her apartment was tropically hot.

At a certain point, she realized she needed to buy commercial baking equipment.

Being good at baking cookies doesn't teach you anything about how to buy professional restaurant equipment.

For a home cook, it's terrifying to have to purchase a commercial oven, cooling racks, an industrial mixer, and start buying ingredients in fifty-pound sacks.

It's even scarier to hire staff, rent kitchen space, and get health permits. One mistake can end your business.

For years, the Internet worked the same way. You could run a small website off of commodity servers, but if your project started gaining traction, you found yourself on the phone with a silken-voiced hardware salesperson about signing contracts for equipment, bandwidth, and colocation space.

It was easy to get out of your depth and make expensive mistakes.

Amazon Web Services changed everything. They offered professional tools on demand, by the hour, at scale. You didn't have to shop for hardware, and you weren't locked in to owning it. You paid a premium for the service, but it removed a ton of risk.

Of course, you still had to learn how to use this stuff. But that was actually fun.

There was always a catch. The gas burners on the stoves were kind of small. The handles would occasionally fall off the frying pans, at unexpected times.

And to their credit, Amazon warned you about this up front, and told you to design your procedures with failures in mind.

Some things were guaranteed to never fail—the freezers, say, would always stay below freezing. But maybe you wouldn't be able to unlock the doors for several hours at a time.

As people began moving to the cloud, it forced them to think at a bigger scale. They had to think in terms of multiple machines and availability zones, and that meant thinking about redundancy and failure tolerance.

All of these are good things at scale, but overkill for a lot of smaller sites. You don't need an entire restaurant kitchen staff to fry an egg.

As the systems got bigger, Amazon started offering more automation. They would not only rent you giant ovens, but a fleet of kitchen robots that you could program to do all kinds of mundane tasks for you.

And again, it was way more fun to program the robots than to do the mundane kitchen tasks yourself.

For a lot of tech companies, where finding good programmers is harder than finding money, it made sense to switch over to the highly automated cloud services entirely.

For programmers, the cloud offered a chance to design distributed systems across dozens or hundreds of servers early in their careers. It was like getting the keys to a 747 right out of flight school.

Most website work is pretty routine. You hook up a database to a template, and make sure no one trips over the power cord.

But automation at scale? That's pretty sweet, and it's difficult!

It's like you took a bunch of small-business accountants and told them they were going to be designing multi-billion dollar corporate tax shelters in the Seychelles.

Suddenly they feel alive, they feel free. They're right at the top of Maslow's hierarchy of needs, self-actualizing on all cylinders. They don't want to go back.

That's what it feels like to be a programmer, lost in the cloud.

Complexity is like a bug light for smart people. We can't resist it, even though we know it's bad for us. This stuff is just so cool to work on.

The upshot is, much of the web is horribly overbuilt.

Technologies for operating at scale developed by companies that need them end up in the hands of people who aspire to work at those scales.

And there's no one to say "no".

Adam Drake wrote an engaging blog post about analyzing 2 million chess games. Rather than using a Hadoop cluster, he just piped together some Unix utilities on a laptop, and got a 235-fold performance improvement over the 'Big Data' approach.

The point is not that people using Hadoop clusters are foolish, or that everything can be done on a laptop. It's that many people's intuition about what constitutes a large system does not reflect the reality of 2015 hardware.

You can do an awful lot on a laptop, or pizza box web server, if you skip the fifty layers of overhead.
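In miniature, the approach looks something like this. It's a hedged sketch in the spirit of Drake's post, not his actual pipeline; the file name is hypothetical, and I'm assuming standard PGN result tags.

    # Stream a multi-gigabyte PGN chess database line by line and tally
    # game results on a single laptop; no cluster required.
    from collections import Counter

    results = Counter()
    with open("millionbase.pgn") as pgn:      # hypothetical file name
        for line in pgn:
            if line.startswith('[Result '):
                results[line.strip()] += 1    # e.g. [Result "1-0"]

    print(results.most_common())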

Let me give you a concrete example. I recently heard from a competitor, let’s call them ACME Bookmarking Co., who are looking to leave the bookmarking game and sell their website.

While ACME has much more traffic than I do, I learned they only have half the daily active users. This was reassuring, because the hard part of scaling a bookmarking site is dealing with people saving stuff.

We both had the same number of employees. They have an intern working on the project part time, while I dither around and travel the world giving talks. Say half a full-time employee for each of us.

We have similar revenue per active user. I gross $12,000 a month, they gross $5,000.

But where the projects differ radically is cost. ACME hosts their service on AWS, and at one point they were paying $23,000 (!!) in monthly fees. Through titanic effort, they have been able to reduce that to $9,000 a month.

I pay just over a thousand dollars a month for hosting, using my own equipment. That figure includes the amortized cost of my hardware, and sodas from the vending machine at the colo.

So while I consider bookmarking a profitable business, to them it's a $4,000/month money pit. I'm living large off the same income stream that is driving them to sell their user data to marketers and get the hell out of the game.

The point is that assumptions about complexity will anchor your expectations, and limit what you're willing to try. If you think a 'real' website has to live in the cloud and run across a dozen machines, a whole range of otherwise viable projects will seem unprofitable.

Similarly, if you think you need a many-layered CMS and extensive custom javascript for an online publishing venture, the range of things you will try becomes very constricted.

Rather than trying to make your overbuilt projects look simple, ask yourself if they can't just be simple.

I don't want to harsh on the cloud. Some of my best friends are in the cloud.

Rather, I want to remind everyone there’s plenty of room at the bottom. Developers today work on top of too many layers to notice how powerful the technology has become. It’s a reverse princess-and-the-pea problem.

The same thing happens in the browser. The core technology is so fast and good that we’ve been able to pile crap on top of it and still have it work tolerably well.

One way to make your website shine is by having the courage to let the browser do what it's optimized to do. Don't assume that all your frameworks and tooling are saving you time or money.

Unfortunately, complexity has become a bit of a bragging point. People boast to one another about what's in their 'stack', and share tips about how to manage it.

"Stack" is the backend equivalent to the word "polyfill". Both of them are signs that you are radically overcomplicating your design.


There's a reason to care about this beyond just aesthetics and efficiency.

Let me use a computer game analogy to express two visions of the future Web.

The first vision is the Web as Minecraft—an open world with simple pieces that obey simple rules. The graphics are kind of clunky, but that’s not the point, and nobody cares.

In this vision, you are meant to be an active participant, you're supposed to create stuff, and you'll have the most fun when you collaborate with others.

The rules of the game are simple and don't constrain you much.

People create astonishing stuff in Minecraft.

Here is an entire city full of skyscrapers, lovingly tended.

Here are some maniacs who have built an entire working CPU out of redstone. If this were scaled up big enough, it could also run Minecraft, which is a mind-bending thought.

The game is easy to learn and leaves you to your own devices. Its lack of polish is part of its appeal.

The other vision is of the web as Call of Duty—an exquisitely produced, kind-of-but-not-really-participatory guided experience with breathtaking effects and lots of opportunities to make in-game purchases.

Creating this kind of Web requires a large team of specialists. No one person can understand the whole pipeline, nor is anyone expected to. Even if someone could master all the technologies in play, the production costs would be prohibitive.

The user experience in this kind of Web is that of being carried along, with the illusion of agency, within fairly strict limits. There's an obvious path you're supposed to follow, and disincentives to keep you from straying from it. As a bonus, the game encodes a whole problematic political agenda. The only way to reject it is not to play.

Despite the lavish production values, there's a strange sameness to everything. You're always in the same brown war zone.

With great effort and skill, you might be able to make minor modifications to this game world. But most people will end up playing exactly the way the publishers intend. It's passive entertainment with occasional button-mashing.

Everything we do to make it harder to create a website or edit a web page, and harder to learn to code by viewing source, promotes that consumerist vision of the web.

Pretending that one needs a team of professionals to put simple articles online will become a self-fulfilling prophecy. Overcomplicating the web means lifting up the ladder that used to make it possible for people to teach themselves and surprise everyone with unexpected new ideas.

Here's the hortatory part of the talk:

Let’s preserve the web as the hypertext medium it is, the only thing of its kind in the world, and not turn it into another medium for consumption, like we have so many examples of already.

Let’s commit to the idea that as computers get faster, and as networks get faster, the web should also get faster.

Let’s not allow the panicked dinosaurs of online publishing to trample us as they stampede away from the meteor. Instead, let's hide in our holes and watch nature take its beautiful course.

Most importantly, let’s break the back of the online surveillance establishment that threatens not just our livelihood, but our liberty. Not only here in Australia, but in America, Europe, the UK—in every free country where the idea of permanent, total surveillance sounded like bad science fiction even ten years ago.

The way to keep giant companies from sterilizing the Internet is to make their sites irrelevant. If all the cool stuff happens elsewhere, people will follow. We did this with AOL and Prodigy, and we can do it again.

For this to happen, it's vital that the web stay participatory. That means not just making sites small enough so the whole world can visit them, but small enough so that people can learn to build their own, by example.

I don't care about bloat because it's inefficient. I care about it because it makes the web inaccessible.

Keeping the Web simple keeps it awesome.

Thank you very much!

HEAVY, ROILING, TROUBLED SEAS OF APPLAUSE

Sincere thanks to Michael Krakovskiy, Jeremy Keith, Nick Heer, and lots of Twitter friends for their help with this talk.


Clojure 2015 Year in Review


Another year, another year-in-review post. To be honest, I feel like any attempt I make to summarize what happened in the Clojure world this year is largely moot. Clojure has gotten so big, so — dare I say it? — mainstream that I can’t even begin to keep up with all the interesting things that are happening. But it’s a tradition, so I’ll stick to it. Once again, here is my incomplete, thoroughly-biased list of notable Clojurey things this year.

As I said of JVM Clojure in 2012, I think I can safely say that 2015 was the year ClojureScript grew up. It got a real release number, improved REPL support, and the ability to compile itself. But you don’t have to take my word for it: David Nolen has written his own ClojureScript Year in Review.

Clojure in the World

We already knew Clojure was being used at big companies like Walmart and Amazon. Based on public job postings, we’ve also seen places like Reuters, Capital One, and Oracle interested in Clojure developers.

Big corporations tend to be cagey about their technology choices, but Walmart’s Anthony Marcar came to Clojure/West to talk about how they do Clojure at Scale.

In other big-tech news, Facebook acquired Wit.ai, a Clojure startup that released an open-source library to parse structured data from text. Clojure early-adopter Prismatic pivoted away from its popular news-recommendation app to focus full-time on the A.I. business as well.

Language, Tools, and Libraries

Clojure 1.7 was released, bringing Transducers and the much-anticipated Reader Conditionals to support mixed Clojure-ClojureScript projects. Writing cross-platform libraries suddenly got easier. A bunch of popular Clojure libraries were ported to ClojureScript, including test.check, tools.reader, and my Component.

core.async got a major new release, with the added features promise-chan, offer!, and poll!.

The big news on the tooling front was the 1.0 release of Cursive, the first commercial IDE for Clojure. On the open-source side, both Light Table and CIDER got major new releases.

In the ClojureScript tooling world, Figwheel and Devcards really took off this year.

Clojars started getting financial support from the community, and CLJSJS started offering JavaScript libraries conveniently packaged for ClojureScript and the Google Closure Compiler.

Books and Docs

clojure.org went open-source for contributions from the community.

New books: Clojure Applied (my review), Clojure for the Brave and True in print, Living Clojure, Clojure Recipes, and many more.

Events and Community

The Clojurians Slack community rocketed from just an idea to over four thousand members. If you don’t care for Slack, the #clojure IRC channel on Freenode is still going.

The Clojure mailing list hit ten thousand members.

At Clojure/conj this year, we had the first-ever Datomic conference. You can binge-watch Clojure conference videos (Clojure/conj, EuroClojure, and Clojure/West) on the ClojureTV YouTube channel. Also check out Clojure eXchange and :clojureD.

Clojure is attracting some interest from academic computer science, including a new paper on optimizing immutable hash maps.

Summary

There’s not much more to say. Or rather, there is very much more to say than what I can capture in a single post. Clojure is here to stay. Let’s enjoy it.

Thanks to David Nolen, Alex Miller, Timothy Baldridge, Carin Meier, and Daemian Mack for their help preparing this post.

Tor Anonymity: Things Not to Do


I wonder what my site looks like when I'm anonymous.

"I wonder what my site looks like when I'm anonymous"[1]

It's best not to visit your own personal website where either real names or pseudonyms (that have ever been tied to a non-Tor connection/IP) are attached. Because how many people are visiting your personal website? 90% of all Tor users, or just you and very few other people? That's weak anonymity. Once you visit a website, your Tor circuit gets dirty. The exit relay knows that someone visited your website, and if the site is not that popular, it's a good guess that 'someone' was you. It wouldn't be hard to assume that further connections originating from that exit relay come from your machine.

Source: [2]

Log into your real-life Facebook account and think you are anonymous.

Don't log into your personal Facebook account, no matter whether your real name or only a pseudonym is attached. You most likely added your friends, and they know who the account belongs to. Through your social network, Facebook can guess who you are.

No anonymity solution is magic. Online anonymity software may reliably hide your IP/location, but Facebook does not need your IP/location. They already know who you are, who your friends are, which private messages you send, and so on. All that data is stored on Facebook's servers, and no software can delete it. Only Facebook and crackers could.

So if you log into your personal Facebook account you only get location privacy. No anonymity.

Quoted from "To Toggle, or not to Toggle: The End of Torbutton"[3]:

mike, am i completely anonymized if i log onto my facebook account? im using firefox 3.6 with tor and no script on windows 7 machine. thank you.

Never log into accounts you have ever used without Tor.

Always assume that each time you visit a website, the operator will log which IP/location visited, at what time, and what you did.

Also assume that each time you're online, your ISP (Internet Service Provider) will log your online time, IP/location and perhaps traffic. Your ISP could also log which IPs/locations you connected to, how much traffic was involved, and what you sent and retrieved. (Unless it's encrypted; then they'll see only garbage.) The following tables give a simplified overview of what those logs could look like.

ISP Log:

Name     | Time           | IP/location | Traffic
John Doe | 16:00 to 17:00 | 1.1.1.1     | 500 megabytes

Extended[4] ISP Log:

Name     | Time           | IP/location | Traffic       | Destination  | Content
John Doe | 16:00 to 17:00 | 1.1.1.1     | 1 megabyte    | google.com   | searched for thing one, thing two...
John Doe | 16:00 to 17:00 | 1.1.1.1     | 490 megabytes | youtube.com  | viewed video 1, video 2, ...
John Doe | 16:00 to 17:00 | 1.1.1.1     | 9 megabytes   | facebook.com | encrypted traffic

Website Log:

Name | Time           | IP/location | Traffic    | Content
-    | 16:00 to 16:10 | 1.1.1.1     | 1 megabyte | searched for thing one, thing two...

You can see that when websites and ISPs keep logs like these, no one needs Sherlock Holmes to put the pieces together.

If you mess up just once and log in over a non-Tor connection/IP that can be tied to you, the whole account is compromised.

Don't log into your bank account, PayPal, eBay or other important personal accounts unless...

Logging into your bank, PayPal, eBay or other important personal accounts registered in your name, where money is involved, risks getting the account suspended for "suspicious activity" by the fraud prevention system. This is because crackers sometimes use Tor for committing fraud. That's probably not what you want.

It's not anonymous, for reasons already explained. It's pseudonymous, and it offers circumvention (in case access to the site is blocked by your ISP) and location privacy. The difference between anonymity and pseudonymity is covered in a later chapter on this page.

In many cases you will be able to contact support and get your account unblocked; on request, the fraud protection policy may even be relaxed for your account.

Whonix developer adrelanos is not against using Tor for circumvention and/or location privacy; you should just be aware of the risk of your account getting (temporarily) suspended, along with the other issues mentioned on this page and the warnings in the Whonix documentation. So if you know what you are doing, feel free.

Don't alternate Tor with open WiFi.

You may think open WiFi is faster and just as safe as Tor, since the IP/location cannot be tied to your real name, right?

It's better to use an open WiFi AND Tor, but not an open WiFi OR Tor.

The approximate location of any IP address can be narrowed down to a city, a region, or even a street. Even if you are away from home, you still give away your city or approximate location, since most people don't switch continents.

You don't know who is running the open WiFi router or what their policies are. They could be keeping logs of your MAC address and tying it to the activity you send in the clear through them.

While this doesn't break your anonymity outright, the circle of suspects has shrunk from the entire world to a continent, a country, or a region. This strongly hurts your anonymity. Keep as much information as possible to yourself.

Prevent Tor over Tor scenarios.

Whonix specific.

When using a transparent proxy (Whonix includes one), it is possible to start a Tor session from the client as well as from the transparent proxy, creating a "Tor over Tor" scenario.

This happens when installing Tor inside Whonix-Workstation or when using the Tor Browser Bundle without configuring it to use a SocksPort instead of the TransPort. (Covered in the Tor Browser article.)

Doing so produces undefined and potentially unsafe behavior. In theory you can get six hops instead of three, but it is not guaranteed that you'll get six different hops; you could end up with the same hops, maybe in reverse or mixed order. It is not clear whether this is safe. It has never been discussed.

You can choose an entry/exit point [5], but you get the best security Tor can provide when you leave the route selection to Tor; overriding the entry/exit relays can mess up your anonymity in ways we don't understand. Therefore, Tor over Tor usage is highly discouraged.
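For readers wondering what the SocksPort/TransPort distinction above refers to: these are ordinary options in Tor's torrc configuration file. A minimal sketch follows; the port numbers are the conventional examples from the Tor documentation, not necessarily Whonix's actual defaults.

    # torrc (sketch): SOCKS-aware applications, such as a properly
    # configured Tor Browser, should talk to the SocksPort; the
    # TransPort transparently torifies everything else. Routing an
    # already-torified client through the TransPort is what creates
    # the Tor over Tor scenario described above.
    SocksPort 9050
    TransPort 9040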

License of "Prevent Tor over Tor scenarios.": [6]

Don't send sensitive data without end-to-end encryption.

As already explained on the Warning page, Tor exit relays can eavesdrop on communications, and other man-in-the-middle attacks can happen. The only way to get sensitive data from the sender to the recipient while withholding it from third parties is end-to-end encryption.

Don't disclose identifying data about yourself.

Deanonymisation is possible not only through connections/IP addresses but through social threats too. Here are some recommendations for avoiding deanonymisation, collected by Anonymous:

  • Do not include personal information in your nickname.
  • Do not discuss personal information, such as where you are from.
  • Do not mention your gender, tattoos, piercings or physical characteristics.
  • Do not mention your profession, hobbies or involvement in activist groups.
  • Do not use special characters that exist only in your language's keyboard layout.
  • Do not post information to the regular internet while you are anonymous. Do not use Twitter and Facebook. This is easy to correlate.
  • Do not post links to Facebook images. The image name contains a personal ID.
  • Do not connect to the same destination at the same time. Try to alternate.
  • IRC, other chats, forums, mailing lists, etc. are public; keep that in mind.
  • Heroes only exist in comic books; keep that in mind! There are only young heroes and dead heroes.

If you must disclose identifying data about yourself, treat it as "sensitive data" per the point above.

License: From the JonDonym documentation (Permission).

Do use bridges if you think Tor usage is dangerous/suspicious in your country.

Quoted from the Bridges page: "Bridges are important tools that work in many cases but they are not an absolute protection against the technical progress that an adversary could do to identify Tor users."

Don't use different online identities at the same time.

They easily correlate. Whonix doesn't magically separate your different contextual identities.

Also read the points below.

Restrict the time you are logged into Twitter, Facebook, Google and any other account-based services (web forums etc.) to the time you are actually using them. After you are done reading, posting and so on, log out. At the very least: log out, shut down Tor Browser, change your Tor circuit using a Tor controller, wait a few seconds until the circuit has changed, then restart Tor Browser. For better security, follow the recommendation to use multiple VM snapshots and/or multiple Whonix-Workstations.

This is because many websites include one or more of the many integration buttons, such as "I like" and "twitter this", along with Google Analytics, AdSense, and so on. Those buttons tell the originating service that you visited the website, because you were still logged into their service.

Also note the chapter "Don't use different online identities at the same time." above.

Do not mix Modes of Anonymity!

Let us begin with an overview of the different Modes of Anonymity:

mode(1): user anonymous; any recipient

  • Scenario: post anonymously a message in a message board/mailing list/comment field
  • Scenario: whistleblower and such
  • You are anonymous.
  • Your real IP/location stays hidden.
  • Location privacy: your location remains secret.

mode(2): user knows recipient; both use Tor

  • Scenario: both sender and recipient know each other and both use Tor.
  • They can communicate with each other without any third party being wise to their activity or even to the knowledge that they are communicating with each other.
  • You are NOT anonymous.
  • Your real IP/location stays hidden.
  • Location privacy: your location remains secret.

mode(3): user with no anonymity using Tor; any recipient

  • Scenario: login with your real name into any services, such as webmail, Twitter, Facebook, etc...
  • You are obviously NOT anonymous. As soon as you log into an account where you entered your real name the website knows your identity. Tor can not make you anonymous in these situations.
  • Your real IP/location stays hidden.
  • Location privacy. Your location remains secret.

mode(4): user with no anonymity; any recipient

  • Scenario: normal browsing without Tor.
  • You are NOT anonymous.
  • Your real IP/location gets revealed.
  • Your location gets revealed.

Conclusion

It's not wise to combine mode(1) and mode(2). For example, if you have an IM or email account and use that via mode(1), you are advised not to use the same account for mode(2). We have explained previously why this is an issue.

It's also not wise to mix two or more modes inside the same Tor session, as they could share the same exit relay (identity correlation).

It's also possible that other combinations of modes are dangerous and could lead to the leakage of personal information or your physical location.

License

License of "Do not mix Modes of Anonymity!": [6]

Don't change settings if you don't know their consequences.

Changing user interface settings for applications that do not connect to the internet is mostly safe. For example, checking a box "don't show this tip of the day anymore" or "hide this menu bar" will have no effect on anonymity.

Check the Whonix documentation to see whether the setting you are interested in changing is documented or recommended against; try to live with the defaults.

Changing settings for applications that connect to the internet, even user interface settings, has to be thoroughly reviewed. For example, removing a menu bar or using full screen in Tor Browser is recommended against. The latter is known to modify the screen size, which is bad for the web fingerprint.

You should modify network settings only with great care, and only if you know the consequences. For example, you should stay away from the advice related to "Firefox Tuning". If you believe the settings are suboptimal, the changes should be proposed upstream, so they get changed for all Tor Browser users with the next release.

Do not use clearnet and Tor at the same time.

Using your non-Tor browser and Tor Browser at the same time risks that you will, at some point, confuse one for the other and deanonymize yourself.

Using clearnet and Tor at the same time also risks connecting to a server anonymously and non-anonymously at the same time, which is recommended against. The reason is explained in the point below. You never know when you are visiting the same page anonymously and non-anonymously at the same time, because you only see the URL you're visiting, not how many resources are fetched in the background. Many different websites are hosted in the same cloud, and services such as Google Analytics sit on the majority of all websites and therefore see a lot of anonymous and non-anonymous connections.

If you really don't want to follow this recommendation, use at least two different desktops to avoid confusing one browser for the other.

Do not connect to any server anonymously and non-anonymously at the same time!

It's highly recommended that you do not connect to any remote server in this manner. That is, do not create a Tor link and a non-Tor link to the same remote server at the same time. In the event your internet connection breaks down (and it eventually will), all your connections will break at the same time, and it won't be hard for an adversary to put the pieces together and determine which public IP/location belongs to which Tor IP/connection, potentially identifying you directly. Another attack a webserver could run is to increase or decrease the speed of either the non-Tor or the Tor link and then check whether either connection speeds up or slows down in step, thereby concluding which non-Tor link belongs to which Tor link.

License of "Do not connect to any server anonymously and non-anonymously at the same time!": [6]

Do not confuse Anonymity with Pseudonymity.

This chapter explains the difference between anonymity and pseudonymity. Word definitions are always a difficult topic, because a majority of people has to agree on them.

An anonymous connection is defined as a connection to a destination server where the destination server has no means to find out the origin (IP/location) of that connection, nor to associate an identifier [7] with it.

A pseudonymous connection is defined as a connection to a destination server where the destination server has no means to find out the origin (IP/location) of the connection, but can associate it with an identifier [7].

In an ideal world, the Tor network, Tor Browser (and the underlying operating system, hardware, physical security, etc.) are perfect. For example, the user could fetch a news website, and neither the news website nor the website's ISP would have any idea whether that user has ever contacted the news website before. [8]

The opposite holds when software is used incorrectly, for example using Firefox instead of the Tor-safe Tor Browser: the origin (IP/location) of a connection is still hidden, but an identifier (for example, cookies) can make that connection pseudonymous. The destination website could then log, for example, "user with id 111222333444 viewed video title a at time b on date c, and video title d at time e on date f". This information can be used for profiling. Over time these profiles become more and more comprehensive, which reduces anonymity; in the worst case, it could lead to de-anonymization.

As soon as someone logs into a website (for example, a forum or e-mail account) with a username, the connection is by definition no longer anonymous but pseudonymous. The origin of the connection (IP/location) is still hidden, but the connection can be associated with an identifier [7], in this case an account name. Identifiers can be used to log various things: the date and time of login and logout, what the user wrote and to whom, the IP address (useless if it's a Tor exit relay), the browser fingerprint, etc.

Maxim Kammerer, developer of Liberté Linux [9], has an interestingly different opinion [10], which I don't want to withhold from you:

I have not seen a compelling argument for anonymity, as opposed to pseudonymity. Enlarging anonymity sets is something that Tor developers do in order to publish incremental papers and justify funding. Most users only need to be pseudonymous, where their location is hidden. Having a unique browser does not magically uncover user's location, if that user does not use that browser for non-pseudonymous activities. Having good browser header results on anonymity checkers equally does not mean much, because there are many ways to uncover more client details (e.g., via Javascript oddities).

Don't be the first one to spread your own link.

You created an anonymous blog or hidden service? Great. You have a Twitter account with lots of followers, run a big clearnet news page or similar? Great. Do not be tempted to be one of the first to advertise your new anonymous project! The more thoroughly you separate the identities, the better. Of course, at some point your other identities may, or even must, "naturally" become aware of it, but be very careful at that point.

Don't open random files or links.

Someone sent you a PDF by mail or gave you a link to a PDF? That sender/mailbox/account/key could be compromised, and the PDF could be crafted to infect your system. Don't open it with the default tool the creator expected you to use. For example, don't open a PDF with a PDF viewer. If the content is public anyway, try a free online PDF viewer.

Don't do (mobile) phone verification.

Websites such as Google, Facebook and others will ask for a (mobile) phone number if you log in over Tor. Unless you are really clever or have an alternative, you shouldn't do it.

Reason: the number you give away will be logged. The SIM card is most likely registered in your name. And even if it isn't, receiving an SMS gives away your location. Even if you bought the SIM card anonymously and receive the SMS far away from your home, there is still a risk: the phone itself. Each time the phone logs into the mobile network, the provider logs the SIM card serial number [11] AND the phone serial number [12]. If you bought the SIM card anonymously but not the phone, the two serials get linked and it's not anonymous. If you really want to do mobile verification, you need a spot far away from your home, a fresh phone, and a fresh SIM card. Afterwards, you must turn off the phone and burn both the phone and the SIM card right after doing it.

You could try to find an online service that receives SMS for you. That would work and would be anonymous. The problem is that it most likely won't work for Google and Facebook, because they actively blacklist such numbers for verification. Or you could try to find someone else to receive the SMS for you, but that would only shift the risk from you to the other person.

Why this page?

You can skip "Why this page?".

This page risks stating obvious things. Obvious to whom? Developers, hackers, geeks, etc. may call it common sense.

Those groups tend to lose contact with actual non-techy users. It's good sometimes to read usability papers or feedback from people who do not post on mailing lists or in forums.

For example:

  • Quoted from "To Toggle, or not to Toggle: The End of Torbutton"[13]:

mike, am i completely anonymized if i log onto my facebook account? im using firefox 3.6 with tor and no script on windows 7 machine. thank you.

  1. https://lists.torproject.org/pipermail/tor-dev/2012-April/003472.html
  2. Tor Browser should set SOCKS username for a request based on referer
  3. https://blog.torproject.org/blog/toggle-or-not-toggle-end-torbutton The Tor Blog
  4. https://en.wikipedia.org/wiki/Deep_packet_inspection
  5. https://www.torproject.org/docs/faq.html.en#ChooseEntryExit
  6. This was originally posted by adrelanos to the TorifyHOWTO (license). Adrelanos didn't surrender any copyrights and can therefore re-use it here. It is under the same license as this DoNot page.
  7. An identifier could be, for example, a (Flash) cookie with a unique number.
  8. Fingerprinting defense isn't perfect yet in any browser. There are still open bugs. See tbb-linkability and tbb-fingerprinting.
  9. http://dee.su/liberte
  10. Quote
  11. IMSI
  12. IMEI
  13. https://blog.torproject.org/blog/toggle-or-not-toggle-end-torbutton The Tor Blog

Attribution

Thanks to intrigeri and anonym, who provided feedback and suggestions for this page on the Tails-dev mailing list.




Untangling an Accounting Tool and an Ancient Incan Mystery

Photo: Patricia Landa, an archaeological conservator, painstakingly cleans and untangles the khipus at her house in Lima. Credit: William Neuman/The New York Times

LIMA, Peru — In a dry canyon strewn with the ruins of a long-dead city, archaeologists have made a discovery they hope will help unravel one of the most tenacious mysteries of ancient Peru: how to read the knotted string records, known as khipus, kept by the Incas.

At the site called Incahuasi, about 100 miles south of Lima, excavators have found, for the first time, several khipus in the place where they were used — in this case, a storage house for agricultural products where they appear to have been used as accounting books to record the amount of peanuts, chili peppers, beans, corn and other items that went in and out.

In some cases the khipus — the first ones were found at the site in 2013 — were buried under the remnants of centuries-old produce, which was preserved thanks to the extremely dry desert conditions.

That was a blockbuster discovery because archaeologists had previously found khipus only in graves, where they were often buried with the scribes who created and used the devices. Many others are in the possession of collectors or museums, stripped of information relating to their provenance.


Khipus are made of a series of cotton or wool strings hanging from a main cord. Each string may have several knots, with the type and location of the knot conveying meaning. The color of the strands used to make the string and the way the strands are twisted together may also be part of the khipus’ system of storing and relaying information.

Researchers have long had a basic understanding of the numerical system incorporated in the khipus, where knots represent numbers and the relation between knots and strings can represent mathematical operations, like addition and subtraction.
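That decimal reading of individual cords is well enough understood to be written down compactly. The sketch below is a minimal illustration of the positional base-10 scheme only; the list format and function names are invented for the example, not anything the researchers use.

    # A minimal sketch of the positional base-10 reading of a single
    # pendant cord: knot clusters from the top of the cord down stand
    # for successive powers of ten. The data format is hypothetical.
    def read_pendant(knot_clusters):
        # e.g. [3, 0, 5] -> 3 hundreds, 0 tens, 5 units -> 305
        value = 0
        for count in knot_clusters:
            value = value * 10 + count
        return value

    # A summary cord could then be checked against its pendants --
    # the kind of redundancy the duplicate Incahuasi khipus suggest.
    pendants = [[3, 0, 5], [0, 2, 7], [1, 1, 0]]
    print(sum(read_pendant(p) for p in pendants))  # 442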

But researchers have been unable to identify the meaning of any possible nonnumerical signifiers in khipus, and as a result they cannot read any nonmathematical words or phrases.

Now the Incahuasi researchers hope that by studying the khipus and comparing them with others in a large database, they may find that the khipus discovered with the peanuts contain a color, knot or other signifier for “peanut.” The same goes for those found with chili peppers, beans and corn.

“We can look at how the chili pepper khipu differs from the peanut khipu and from the corn khipu in terms of their color and other characteristics and we can build up a kind of sign vocabulary of how they were signifying this or that thing in their world,” said Gary Urton, a leading expert on khipus who is studying the new trove with Alejandro Chu, the archaeologist who led the excavation.

“It’s not the great Rosetta Stone but it’s quite an important new body of data to work with,” he said, adding, “It’s tremendously exciting.”

For now, the 29 khipus from Incahuasi, which are about 500 years old, are kept in an unassuming brick house in a residential neighborhood in Lima, along with a scattering of artifacts from other excavations, including two mummies (of a child and a dog), some bags of human bones, dozens of fragile textiles rolled up between layers of paper, and numerous pots meticulously reassembled from shards.

The house belongs to Patricia Landa, an archaeological conservator, who also keeps a menagerie of cats and dogs, including three hairless Peruvian dogs of the kind once raised by the Incas for food.

It is Ms. Landa who takes the Incahuasi khipus, some of which were found neatly rolled up and others in snarled jumbles, and painstakingly cleans and untangles them and prepares them for researchers to decipher.

“You have a very special relationship with the material,” Ms. Landa, 59, said. “I talk to them. I say, ‘Excuse me for disturbing your rest but you’re helping us to understand your ancestors.’ ”

Incahuasi, which means “house of the Inca emperor,” was a city used in the late 15th and early 16th centuries as the base of operations for the Inca invasion of Peru’s southern coast, after which it became a thriving administrative center, according to Mr. Chu, the archaeologist. It sat in the arid hills above the green valley of the Cañete River.

[Photo: A khipu before it has been cleaned and untangled. Credit: William Neuman/The New York Times]

“There was probably lots of movement, with llama caravans bringing in farm produce,” he said.

The storehouse where the khipus were found was probably used to keep food needed to maintain the large number of troops deployed in the invasion.

The Incas, who were highly organized and governed a vast area, would have used khipus to keep track of provisions, and copies of the string records were probably sent to an administrative center, such as Cusco, the Inca capital, where they could be read, checked and perhaps filed. The Incahuasi excavation has even turned up what are essentially duplicate sets of khipus tied together, which the researchers believe could have been made when the same products were counted twice — perhaps to guarantee accurate bookkeeping.

One khipu found at the site had its knots untied, suggesting that the information stored there had been “erased” by the accountants so that the khipu could be reused, Ms. Landa said.

The khipus found at Incahuasi appear to be all about counting beans, literally. But colonial-era documents suggest that khipus had many uses in both the pre-Hispanic and colonial period that went beyond accounting, including to keep calendrical information and to tell historical narratives.

Colonial records show that in some cases, such as land disputes, indigenous litigants would bring khipus to court and use them to explain or justify claims of land ownership, Mr. Chu said. He said that scribes would read the khipus and a court clerk would enter the information into the trial record.

Mr. Urton has created a database of all known khipus, about 870 of them, with detailed information on two-thirds of them, recording their configurations, colors, numerical values and other information.

Because the Incahuasi khipus appear to be relatively simple inventories of agricultural products, it may be easier to decipher them than the more complex khipus that record historical information, Mr. Chu said.

And a breakthrough in deciphering the Incahuasi khipus could be a first step in reading more complex versions.

“If we can find the connection between the khipu and the product that it was found with we can contribute to the deciphering of the khipus,” Mr. Chu said.

Mr. Urton said that the difference between the accounting khipus at Incahuasi and more elaborate khipus, “is the difference between, let’s say, your tax form and a novel.” But they may also have key similarities: “They both use the same language, they both use the same numbers when they use numbers, and it’s in the same writing system.”

The excavations at Incahuasi have stopped because of a lack of financing. Much of the vast storeroom complex has yet to be excavated, and Mr. Chu hopes there could be more khipus there.

“It was very exciting to find them,” Mr. Chu said. “We started to find the storerooms and we didn’t think we would find any khipus. Then we started to clear away the dirt and we saw the knots.”


Peter Naur: Programming as Theory Building (1985)


Introduction

The present discussion is a contribution to the understanding of what programming is. It suggests that programming properly should be regarded as an activity by which the programmers form or achieve a certain kind of insight, a theory, of the matters at hand. This suggestion is in contrast to what appears to be a more common notion, that programming should be regarded as a production of a program and certain other texts.

Some of the background of the views presented here is to be found in certain observations of what actually happens to programs and the teams of programmers dealing with them, particularly in situations arising from unexpected and perhaps erroneous program executions or reactions, and on the occasion of modifications of programs. The difficulty of accommodating such observations in a production view of programming suggests that this view is misleading. The theory building view is presented as an alternative.

A more general background of the presentation is a conviction that it is important to have an appropriate understanding of what programming is. If our understanding is inappropriate we will misunderstand the difficulties that arise in the activity and our attempts to overcome them will give rise to conflicts and frustrations.

In the present discussion some of the crucial background experience will first be outlined. This is followed by an explanation of a theory of what programming is, denoted the Theory Building View. The subsequent sections enter into some of the consequences of the Theory Building View.

Programming and the Programmers’ Knowledge

I shall use the word programming to denote the whole activity of design and implementation of programmed solutions. What I am concerned with is the activity of matching some significant part and aspect of an activity in the real world to the formal symbol manipulation that can be done by a program running on a computer. With such a notion it follows directly that the programming activity I am talking about must include the development in time corresponding to the changes taking place in the real world activity being matched by the program execution, in other words program modifications.

One way of stating the main point I want to make is that programming in this sense primarily must be the programmers’ building up knowledge of a certain kind, knowledge taken to be basically the programmers’ immediate possession, any documentation being an auxiliary, secondary product.

As a background of the further elaboration of this view given in the following sections, the remainder of the present section will describe some real experience of dealing with large programs that has seemed to me more and more significant as I have pondered over the problems. In either case the experience is my own or has been communicated to me by persons having first hand contact with the activity in question.

Case 1 concerns a compiler. It has been developed by a group A for a Language L and worked very well on computer X. Now another group B has the task to write a compiler for a language L + M, a modest extension of L, for computer Y. Group B decides that the compiler for L developed by group A will be a good starting point for their design, and get a contract with group A that they will get support in the form of full documentation, including annotated program texts and much additional written design discussion, and also personal advice. The arrangement was effective and group B managed to develop the compiler they wanted. In the present context the significant issue is the importance of the personal advice from group A in the matters that concerned how to implement the extensions M to the language. During the design phase group B made suggestions for the manner in which the extensions should be accommodated and submitted them to group A for review. In several major cases it turned out that the solutions suggested by group B were found by group A to make no use of the facilities that were not only inherent in the structure of the existing compiler but were discussed at length in its documentation, and to be based instead on additions to that structure in the form of patches that effectively destroyed its power and simplicity. The members of group A were able to spot these cases instantly and could propose simple and effective solutions, framed entirely within the existing structure. This is an example of how the full program text and additional documentation is insufficient in conveying to even the highly motivated group B the deeper insight into the design, that theory which is immediately present to the members of group A.

In the years following these events the compiler developed by group B was taken over by other programmers of the same organization, without guidance from group A. Information obtained by a member of group A about the compiler resulting from the further modification of it after about 10 years made it clear that at that later stage the original powerful structure was still visible, but made entirely ineffective by amorphous additions of many different kinds. Thus, again, the program text and its documentation has proved insufficient as a carrier of some of the most important design ideas.

Case 2 concerns the installation and fault diagnosis of a large real–time system for monitoring industrial production activities. The system is marketed by its producer, each delivery of the system being adapted individually to its specific environment of sensors and display devices. The size of the program delivered in each installation is of the order of 200,000 lines. The relevant experience from the way this kind of system is handled concerns the role and manner of work of the group of installation and fault finding programmers. The facts are, first that these programmers have been closely concerned with the system as a full time occupation over a period of several years, from the time the system was under design. Second, when diagnosing a fault these programmers rely almost exclusively on their ready knowledge of the system and the annotated program text, and are unable to conceive of any kind of additional documentation that would be useful to them. Third, other programmers’ groups who are responsible for the operation of particular installations of the system, and thus receive documentation of the system and full guidance on its use from the producer’s staff, regularly encounter difficulties that upon consultation with the producer’s installation and fault finding programmer are traced to inadequate understanding of the existing documentation, but which can be cleared up easily by the installation and fault finding programmers.

The conclusion seems inescapable that at least with certain kinds of large programs, the continued adaption, modification, and correction of errors in them, is essentially dependent on a certain kind of knowledge possessed by a group of programmers who are closely and continuously connected with them.

Ryle’s Notion of Theory

If it is granted that programming must involve, as the essential part, a building up of the programmers’ knowledge, the next issue is to characterize that knowledge more closely. What will be considered here is the suggestion that the programmers’ knowledge properly should be regarded as a theory, in the sense of Ryle [1949]. Very briefly, a person who has or possesses a theory in this sense knows how to do certain things and in addition can support the actual doing with explanations, justifications, and answers to queries, about the activity of concern. It may be noted that Ryle’s notion of theory appears as an example of what K. Popper [Popper, and Eccles, 1977] calls unembodied World 3 objects and thus has a defensible philosophical standing. In the present section we shall describe Ryle’s notion of theory in more detail.

Ryle [1949] develops his notion of theory as part of his analysis of the nature of intellectual activity, particularly the manner in which intellectual activity differs from, and goes beyond, activity that is merely intelligent. In intelligent behaviour the person displays, not any particular knowledge of facts, but the ability to do certain things, such as to make and appreciate jokes, to talk grammatically, or to fish. More particularly, the intelligent performance is characterized in part by the person’s doing them well, according to certain criteria, but further displays the person’s ability to apply the criteria so as to detect and correct lapses, to learn from the examples of others, and so forth. It may be noted that this notion of intelligence does not rely on any notion that the intelligent behaviour depends on the person’s following or adhering to rules, prescriptions, or methods. On the contrary, the very act of adhering to rules can be done more or less intelligently; if the exercise of intelligence depended on following rules there would have to be rules about how to follow rules, and about how to follow the rules about following rules, etc. in an infinite regress, which is absurd.

What characterizes intellectual activity, over and beyond activity that is merely intelligent, is the person’s building and having a theory, where theory is understood as the knowledge a person must have in order not only to do certain things intelligently but also to explain them, to answer queries about them, to argue about them, and so forth. A person who has a theory is prepared to enter into such activities; while building the theory the person is trying to get it.

The notion of theory in the sense used here applies not only to the elaborate constructions of specialized fields of enquiry, but equally to activities that any person who has received education will participate in on certain occasions. Even quite unambitious activities of everyday life may give rise to people’s theorizing, for example in planning how to place furniture or how to get to some place by means of certain means of transportation.

The notion of theory employed here is explicitly not confined to what may be called the most general or abstract part of the insight. For example, to have Newton’s theory of mechanics as understood here it is not enough to understand the central laws, such as that force equals mass times acceleration. In addition, as described in more detail by Kuhn [1970, p. 187ff], the person having the theory must have an understanding of the manner in which the central laws apply to certain aspects of reality, so as to be able to recognize and apply the theory to other similar aspects. A person having Newton’s theory of mechanics must thus understand how it applies to the motions of pendulums and the planets, and must be able to recognize similar phenomena in the world, so as to be able to employ the mathematically expressed rules of the theory properly.

The dependence of a theory on a grasp of certain kinds of similarity between situations and events of the real world gives the reason why the knowledge held by someone who has the theory could not, in principle, be expressed in terms of rules. In fact, the similarities in question are not, and cannot be, expressed in terms of criteria, no more than the similarities of many other kinds of objects, such as human faces, tunes, or tastes of wine, can be thus expressed.

The Theory To Be Built by the Programmer

In terms of Ryle’s notion of theory, what has to be built by the programmer is a theory of how certain affairs of the world will be handled by, or supported by, a computer program. On the Theory Building View of programming the theory built by the programmers has primacy over such other products as program texts, user documentation, and additional documentation such as specifications.

In arguing for the Theory Building View, the basic issue is to show how the knowledge possessed by the programmer by virtue of his or her having the theory necessarily, and in an essential manner, transcends that which is recorded in the documented products. The answers to this issue is that the programmer’s knowledge transcends that given in documentation in at least three essential areas:

1) The programmer having the theory of the program can explain how the solution relates to the affairs of the world that it helps to handle. Such an explanation will have to be concerned with the manner in which the affairs of the world, both in their overall characteristics and their details, are, in some sense, mapped into the program text and into any additional documentation. Thus the programmer must be able to explain, for each part of the program text and for each of its overall structural characteristics, what aspect or activity of the world is matched by it. Conversely, for any aspect or activity of the world the programmer is able to state its manner of mapping into the program text. By far the largest part of the world aspects and activities will of course lie outside the scope of the program text, being irrelevant in the context. However, the decision that a part of the world is relevant can only be made by someone who understands the whole world. This understanding must be contributed by the programmer.

2) The programmer having the theory of the program can explain why each part of the program is what it is, in other words is able to support the actual program text with a justification of some sort. The final basis of the justification is and must always remain the programmer’s direct, intuitive knowledge or estimate. This holds even where the justification makes use of reasoning, perhaps with application of design rules, quantitative estimates, comparisons with alternatives, and such like, the point being that the choice of the principles and rules, and the decision that they are relevant to the situation at hand, again must in the final analysis remain a matter of the programmer’s direct knowledge.

3) The programmer having the theory of the program is able to respond constructively to any demand for a modification of the program so as to support the affairs of the world in a new manner. Designing how a modification is best incorporated into an established program depends on the perception of the similarity of the new demand with the operational facilities already built into the program. The kind of similarity that has to be perceived is one between aspects of the world. It only makes sense to the agent who has knowledge of the world, that is to the programmer, and cannot be reduced to any limited set of criteria or rules, for reasons similar to the ones given above why the justification of the program cannot be thus reduced.

While the discussion of the present section presents some basic arguments for adopting the Theory Building View of programming, an assessment of the view should take into account to what extent it may contribute to a coherent understanding of programming and its problems. Such matters will be discussed in the following sections.

Problems and Costs of Program Modifications

A prominent reason for proposing the Theory Building View of programming is the desire to establish an insight into programming suitable for supporting a sound understanding of program modifications. This question will therefore be the first one to be taken up for analysis.

One thing seems to be agreed by everyone, that software will be modified. It is invariably the case that a program, once in operation, will be felt to be only part of the answer to the problems at hand. Also the very use of the program itself will inspire ideas for further useful services that the program ought to provide. Hence the need for ways to handle modifications.

The question of program modifications is closely tied to that of programming costs. In the face of a need for a changed manner of operation of the program, one hopes to achieve a saving of costs by making modifications of an existing program text, rather than by writing an entirely new program.

The expectation that program modifications at low cost ought to be possible is one that calls for closer analysis. First it should be noted that such an expectation cannot be supported by analogy with modifications of other complicated man–made constructions. Where modifications are occasionally put into action, for example in the case of buildings, they are well known to be expensive and in fact complete demolition of the existing building followed by new construction is often found to be preferable economically. Second, the expectation of the possibility of low cost program modifications conceivably finds support in the fact that a program is a text held in a medium allowing for easy editing. For this support to be valid it must clearly be assumed that the dominating cost is one of text manipulation. This would agree with a notion of programming as text production. On the Theory Building View this whole argument is false. This view gives no support to an expectation that program modifications at low cost are generally possible.

A further closely related issue is that of program flexibility. In including flexibility in a program we build into the program certain operational facilities that are not immediately demanded, but which are likely to turn out to be useful. Thus a flexible program is able to handle certain classes of changes of external circumstances without being modified.

It is often stated that programs should be designed to include a lot of flexibility, so as to be readily adaptable to changing circumstances. Such advice may be reasonable as far as flexibility that can be easily achieved is concerned. However, flexibility can in general only be achieved at a substantial cost. Each item of it has to be designed, including what circumstances it has to cover and by what kind of parameters it should be controlled. Then it has to be implemented, tested, and described. This cost is incurred in achieving a program feature whose usefulness depends entirely on future events. It must be obvious that built–in program flexibility is no answer to the general demand for adapting programs to the changing circumstances of the world.

In a program modification an existing programmed solution has to be changed so as to cater for a change in the real world activity it has to match. What is needed in a modification, first of all, is a confrontation of the existing solution with the demands called for by the desired modification. In this confrontation the degree and kind of similarity between the capabilities of the existing solution and the new demands has to be determined. This need for a determination of similarity brings out the merit of the Theory Building View. Indeed, precisely in a determination of similarity the shortcoming of any view of programming that ignores the central requirement for the direct participation of persons who possess the appropriate insight becomes evident. The point is that the kind of similarity that has to be recognized is accessible to the human beings who possess the theory of the program, although entirely outside the reach of what can be determined by rules, since even the criteria on which to judge it cannot be formulated. From the insight into the similarity between the new requirements and those already satisfied by the program, the programmer is able to design the change of the program text needed to implement the modification.

In a certain sense there can be no question of a theory modification, only of a program modification. Indeed, a person having the theory must already be prepared to respond to the kinds of questions and demands that may give rise to program modifications. This observation leads to the important conclusion that the problems of program modification arise from acting on the assumption that programming consists of program text production, instead of recognizing programming as an activity of theory building.

On the basis of the Theory Building View the decay of a program text as a result of modifications made by programmers without a proper grasp of the underlying theory becomes understandable. As a matter of fact, if viewed merely as a change of the program text and of the external behaviour of the execution, a given desired modification may usually be realized in many different ways, all correct. At the same time, if viewed in relation to the theory of the program these ways may look very different, some of them perhaps conforming to that theory or extending it in a natural way, while others may be wholly inconsistent with that theory, perhaps having the character of unintegrated patches on the main part of the program. This difference of character of various changes is one that can only make sense to the programmer who possesses the theory of the program. At the same time the character of changes made in a program text is vital to the longer term viability of the program. For a program to retain its quality it is mandatory that each modification is firmly grounded in the theory of it. Indeed, the very notion of qualities such as simplicity and good structure can only be understood in terms of the theory of the program, since they characterize the actual program text in relation to such program texts that might have been written to achieve the same execution behaviour, but which exist only as possibilities in the programmer’s understanding.

Program Life, Death, and Revival

A main claim of the Theory Building View of programming is that an essential part of any program, the theory of it, is something that could not conceivably be expressed, but is inextricably bound to human beings. It follows that in describing the state of the program it is important to indicate the extent to which programmers having its theory remain in charge of it. As a way in which to emphasize this circumstance one might extend the notion of program building by notions of program life, death, and revival. The building of the program is the same as the building of the theory of it by and in the team of programmers. During the program life a programmer team possessing its theory remains in active control of the program, and in particular retains control over all modifications. The death of a program happens when the programmer team possessing its theory is dissolved. A dead program may continue to be used for execution in a computer and to produce useful results. The actual state of death becomes visible when demands for modifications of the program cannot be intelligently answered. Revival of a program is the rebuilding of its theory by a new programmer team.

The extended life of a program according to these notions depends on the taking over by new generations of programmers of the theory of the program. For a new programmer to come to possess an existing theory of a program it is insufficient that he or she has the opportunity to become familiar with the program text and other documentation. What is required is that the new programmer has the opportunity to work in close contact with the programmers who already possess the theory, so as to be able to become familiar with the place of the program in the wider context of the relevant real world situations and so as to acquire the knowledge of how the program works and how unusual program reactions and program modifications are handled within the program theory. This problem of education of new programmers in an existing theory of a program is quite similar to that of the educational problem of other activities where the knowledge of how to do certain things dominates over the knowledge that certain things are the case, such as writing and playing a musical instrument. The most important educational activity is the student’s doing the relevant things under suitable supervision and guidance. In the case of programming the activity should include discussions of the relation between the program and the relevant aspects and activities of the real world, and of the limits set on the real world matters dealt with by the program.

A very important consequence of the Theory Building View is that program revival, that is reestablishing the theory of a program merely from the documentation, is strictly impossible. Lest this consequence may seem unreasonable it may be noted that the need for revival of an entirely dead program probably will rarely arise, since it is hardly conceivable that the revival would be assigned to new programmers without at least some knowledge of the theory had by the original team. Even so the Theory Building View suggests strongly that program revival should only be attempted in exceptional situations and with full awareness that it is at best costly, and may lead to a revived theory that differs from the one originally had by the program authors and so may contain discrepancies with the program text.

In preference to program revival, the Theory Building View suggests, the existing program text should be discarded and the new–formed programmer team should be given the opportunity to solve the given problem afresh. Such a procedure is more likely to produce a viable program than program revival, and at no higher, and possibly lower, cost. The point is that building a theory to fit and support an existing program text is a difficult, frustrating, and time consuming activity. The new programmer is likely to feel torn between loyalty to the existing program text, with whatever obscurities and weaknesses it may contain, and the new theory that he or she has to build up, and which, for better or worse, most likely will differ from the original theory behind the program text.

Similar problems are likely to arise even when a program is kept continuously alive by an evolving team of programmers, as a result of the differences of competence and background experience of the individual programmers, particularly as the team is being kept operational by inevitable replacements of the individual members.

Method and Theory Building

Recent years have seen much interest in programming methods. In the present section some comments will be made on the relation between the Theory Building View and the notions behind programming methods.

To begin with, what is a programming method? This is not always made clear, even by authors who recommend a particular method. Here a programming method will be taken to be a set of work rules for programmers, telling what kind of things the programmers should do, in what order, which notations or languages to use, and what kinds of documents to produce at various stages.

In comparing this notion of method with the Theory Building View of programming, the most important issue is that of actions or operations and their ordering. A method implies a claim that program development can and should proceed as a sequence of actions of certain kinds, each action leading to a particular kind of documented result. In building the theory there can be no particular sequence of actions, for the reason that a theory held by a person has no inherent division into parts and no inherent ordering. Rather, the person possessing a theory will be able to produce presentations of various sorts on the basis of it, in response to questions or demands.

As to the use of particular kinds of notation or formalization, again this can only be a secondary issue since the primary item, the theory, is not, and cannot be, expressed, and so no question of the form of its expression arises.

It follows that on the Theory Building View, for the primary activity of the programming there can be no right method.

This conclusion may seem to conflict with established opinion, in several ways, and might thus be taken to be an argument against the Theory Building View. Two such apparent contradictions shall be taken up here, the first relating to the importance of method in the pursuit of science, the second concerning the success of methods as actually used in software development.

The first argument is that software development should be based on scientific manners, and so should employ procedures similar to scientific methods. The flaw of this argument is the assumption that there is such a thing as scientific method and that it is helpful to scientists. This question has been the subject of much debate in recent years, and the conclusion of such authors as Feyerabend [1978], taking his illustrations from the history of physics, and Medawar [1982], arguing as a biologist, is that the notion of scientific method as a set of guidelines for the practising scientist is mistaken.

This conclusion is not contradicted by such work as that of Polya [1954, 1957] on problem solving. This work takes its illustrations from the field of mathematics and leads to insight which is also highly relevant to programming. However, it cannot be claimed to present a method on which to proceed. Rather, it is a collection of suggestions aiming at stimulating the mental activity of the problem solver, by pointing out different modes of work that may be applied in any sequence.

The second argument that may seem to contradict the dismissal of method of the Theory Building View is that the use of particular methods has been successful, according to published reports. To this argument it may be answered that a methodically satisfactory study of the efficacy of programming methods so far never seems to have been made. Such a study would have to employ the well established technique of controlled experiments (cf. [Brooks, 1980] or [Moher and Schneider, 1982]). The lack of such studies is explainable partly by the high cost that would undoubtedly be incurred in such investigations if the results were to be significant, partly by the problems of establishing in an operational fashion the concepts underlying what is called methods in the field of program development. Most published reports on such methods merely describe and recommend certain techniques and procedures, without establishing their usefulness or efficacy in any systematic way. An elaborate study of five different methods by C. Floyd and several co–workers [Floyd, 1984] concludes that the notion of methods as systems of rules that in an arbitrary context and mechanically will lead to good solutions is an illusion. What remains is the effect of methods in the education of programmers. This conclusion is entirely compatible with the Theory Building View of programming. Indeed, on this view the quality of the theory built by the programmer will depend to a large extent on the programmer’s familiarity with model solutions of typical problems, with techniques of description and verification, and with principles of structuring systems consisting of many parts in complicated interactions. Thus many of the items of concern of methods are relevant to theory building. Where the Theory Building View departs from that of the methodologists is on the question of which techniques to use and in what order. On the Theory Building View this must remain entirely a matter for the programmer to decide, taking into account the actual problem to be solved.

Programmers’ Status and the Theory Building View

The areas where the consequences of the Theory Building View contrast most strikingly with those of the more prevalent current views are those of the programmers’ personal contribution to the activity and of the programmers’ proper status.

The contrast between the Theory Building View and the more prevalent view of the programmers’ personal contribution is apparent in much of the common discussion of programming. As just one example, consider the study of modifiability of large software systems by Oskarsson [1982]. This study gives extensive information on a considerable number of modifications in one release of a large commercial system. The description covers the background, substance, and implementation, of each modification, with particular attention to the manner in which the program changes are confined to particular program modules. However, there is no suggestion whatsoever that the implementation of the modifications might depend on the background of the 500 programmers employed on the project, such as the length of time they have been working on it, and there is no indication of the manner in which the design decisions are distributed among the 500 programmers. Even so the significance of an underlying theory is admitted indirectly in statements such as that ‘decisions were implemented in the wrong block’ and in a reference to ‘a philosophy of AXE’. However, by the manner in which the study is conducted these admissions can only remain isolated indications.

More generally, much current discussion of programming seems to assume that programming is similar to industrial production, the programmer being regarded as a component of that production, a component that has to be controlled by rules of procedure and which can be replaced easily. Another related view is that human beings perform best if they act like machines, by following rules, with a consequent stress on formal modes of expression, which make it possible to formulate certain arguments in terms of rules of formal manipulation. Such views agree well with the notion, seemingly common among persons working with computers, that the human mind works like a computer. At the level of industrial management these views support treating programmers as workers of fairly low responsibility, and only brief education.

On the Theory Building View the primary result of the programming activity is the theory held by the programmers. Since this theory by its very nature is part of the mental possession of each programmer, it follows that the notion of the programmer as an easily replaceable component in the program production activity has to be abandoned. Instead the programmer must be regarded as a responsible developer and manager of the activity in which the computer is a part. In order to fill this position he or she must be given a permanent position, of a status similar to that of other professionals, such as engineers and lawyers, whose active contributions as employees of enterprises rest on their intellectual proficiency.

The raising of the status of programmers suggested by the Theory Building View will have to be supported by a corresponding reorientation of the programmer education. While skills such as the mastery of notations, data representations, and data processes, remain important, the primary emphasis would have to turn in the direction of furthering the understanding and talent for theory formation. To what extent this can be taught at all must remain an open question. The most hopeful approach would be to have the student work on concrete problems under guidance, in an active and constructive environment.

Conclusions

Accepting program modifications demanded by changing external circumstances to be an essential part of programming, it is argued that the primary aim of programming is to have the programmers build a theory of the way the matters at hand may be supported by the execution of a program. Such a view leads to a notion of program life that depends on the continued support of the program by programmers having its theory. Further, on this view the notion of a programming method, understood as a set of rules of procedure to be followed by the programmer, is based on invalid assumptions and so has to be rejected. As further consequences of the view, programmers have to be accorded the status of responsible, permanent developers and managers of the activity of which the computer is a part, and their education has to emphasize the exercise of theory building, side by side with the acquisition of knowledge of data processing and notations.

References

Brooks, R. E. Studying programmer behaviour experimentally. Comm. ACM 23(4): 207–213, 1980.

Feyerabend, P. Against Method. London, Verso Editions, 1978; ISBN: 86091–700–2.

Floyd, C. Eine Untersuchung von Software–Entwicklungs–Methoden. Pp. 248–274 in Programmierumgebungen und Compiler, ed H. Morgenbrod and W. Sammer, Tagung I/1984 des German Chapter of the ACM, Stuttgart, Teubner Verlag, 1984; ISBN: 3–519–02437–3.

Kuhn, T.S. The Structure of Scientific Revolutions, Second Edition. Chicago, University of Chicago Press, 1970; ISBN: 0–226–45803–2.

Medawar, P. Pluto’s Republic. Oxford, University Press, 1982: ISBN: 0–19–217726–5.

Moher, T., and Schneider, G. M. Methodology and experimental research in software engineering, Int. J. Man–Mach. Stud. 16: 65–87, Jan. 1982.

Oskarsson, Ö Mechanisms of modifiability in large software systems Linköping Studies in Science and Technology, Dissertations, no. 77, Linköping, 1982; ISBN: 91–7372–527–7.

Polya, G. How To Solve It . New York, Doubleday Anchor Book, 1957.

Polya, G. Mathematics and Plausible Reasoning. New Jersey, Princeton University Press, 1954.

Popper, K. R., and Eccles, J. C. The Self and Its Brain. London, Routledge and Kegan Paul, 1977.

Ryle, G. The Concept of Mind. Harmondsworth, England, Penguin, 1963; first published 1949.

Up for Grabs: Projects which have curated tasks for new contributors


This is a list of projects which have curated tasks specifically for new contributors. These are a great way to get started with a project, or to help share the load of working on open source projects.

Find a project you'd like to get involved with:

  • read the contributor guidelines for the project
  • get the project running locally
  • leave a message on a task you'd like to work on
  • get to work!

We're looking for projects whose maintainers can take the time to help mentor developers as they get started with open source.

What sort of tasks are a good fit?

  • Tasks should take no longer than a few nights' worth of work
  • Tasks should stand alone - avoid core functionality on which other tasks depend
  • Tasks should be well described with pointers to help the implementer

We suggest the tag up-for-grabs, but using a different name is also acceptable.
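For contributors hunting for tasks tagged this way, GitHub's public issue search can list them across projects. A minimal sketch, assuming the up-for-grabs label name suggested above (unauthenticated requests are heavily rate-limited, so treat it as illustrative only):

    # List a few open GitHub issues labeled "up-for-grabs" via the
    # public search API. No error handling; illustration only.
    import json
    import urllib.request

    url = ("https://api.github.com/search/issues"
           "?q=label:up-for-grabs+state:open&per_page=5")
    with urllib.request.urlopen(url) as resp:
        data = json.load(resp)

    for issue in data["items"]:
        print(issue["title"], "->", issue["html_url"])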

If this sounds like you, getting involved is simple:

  1. Tag bugs and feature requests that would be a good place to start
  2. Grab the URL to this list of tasks so that others can get to it easily
  3. Sign in to GitHub and check out the README.

Agromafia


In Italy, Bill Whitaker finds out that the long arms of the Mafia extend to agricultural products, especially olive oil, on which the mob makes huge profits by exporting imitations

The following is a script from "Agromafia" which aired on Jan. 3, 2016. Bill Whitaker is the correspondent. Guy Campanile, producer.

When it comes to knock-offs of Italian classics -- you probably think of fake Guccis or Pradas -- not food.

But last month, police in Italy nabbed 7,000 tons of phony olive oil. Much of it was bound for American stores. The oil was from North Africa, deodorized with chemicals and rebranded as more expensive Italian extra virgin. The scam was cooked-up by organized crime.

Mafia copies of fine olive oil, wine and cheese have fueled an explosion of food crime in Italy. It's estimated to be a $16 billion-a-year enterprise. The Italians call it "Agromafia"...and it's a scandal for a people whose cuisine is considered a national treasure.

The image of gangsters in the kitchen was too delicious for us to ignore. So we went to Italy, where we found elite food police hunting wiseguys and signs Agromafia specialties are reaching the United States.

Leave it to the Italians to fight the Mafia with good taste. This panel certifies the authenticity of extra virgin olive oil - a favorite target of the Agromafia.

They can tell at first sip whether extra virgin has been diluted with cheap sunflower oil or canola.

[Photo: Bill Whitaker and Sergio Tirro. CBS News]

Bill Whitaker: Sergio, why do they make that sound like they are sucking in air?

Sergio Tirro: They need it to mist it in the back of their throats...

Bill Whitaker: They have to suck it in the back of their throats...

Sergio Tirro: They have to suck it in...

Major Sergio Tirro is considered one of the top investigators of food fraud in Europe. Think Eliot Ness - in a uniform designed by Giorgio Armani.

Sergio Tirro: Most of the fraud has been discovered with the expertise like this...

Their skill is so respected, Italian courts will accept taste results as evidence. Tirro has 60 cops trained to do this too and 1,100 more conducting inspections and fraud investigations. On the day we visited headquarters, officers were monitoring wiretaps and live video from hidden cameras placed in suspected warehouses around Italy.

Bill Whitaker: This looks like the FBI...

Sergio Tirro: Yes. We can call ourselves the FBI of food.

In the last two years, they have seized 59,000 tons of food. The Agromafia's ingredients are poor quality and sometimes contaminated with solvents or pesticides.

Bill Whitaker: When I tell somebody that I'm coming to Italy to do a piece about food fraud -- it almost seems unbelievable.

Sergio Tirro: It is a serious problem because it's not only a commercial fraud...if you adulterate an extra virgin olive oil with seed oil and those bottles reach consumers who are allergic to seed oil, you are sending them bombs.

Bill Whitaker: Bombs...on your kitchen shelf.

Sergio Tirro: Yes.

The Agromafia has also tried to rip-off Italian shoppers with mozzarella whitened with detergent and rotten seafood deodorized with citric acid.

Bill Whitaker: My favorite, Italian wines, how are they adulterated?

Sergio Tirro: They generally use, to mix, poor quality wine and brand it as famous wine.

Bill Whitaker: So you take a cheap table wine and just put a famous stamp on it?

Sergio Tirro: Yes.

Bill Whitaker: And sell it?

Sergio Tirro: Yes.

In Tuscany, cops found 42,000 gallons of run-of-the-mill red that was going to be sold as top-notch Brunello di Montalcino. The score could have been $5 million.

Bill Whitaker: So this is everything. Olive oil, tomatoes...

Tom Mueller: Milk, butter, bread, a wide range of different foods.

Journalist Tom Mueller has lived in Italy for 20 years and speaks routinely with investigators and food producers.

Bill Whitaker: So where along the food chain does the Mafia get involved?

Tom Mueller: From harvesting...they impose their own workers, they impose prices...to the transportation and there's involvement in-- Mafia involvement in supermarkets as well. So certain areas, they have really infiltrated the entire food chain from the farm to the fork.

Mueller first wrote about olive oil fraud in 2007 for the New Yorker Magazine.

Tom Mueller: You in many cases are getting lower grade olive oil that has been blended with some good extra virgin olive oil...you're sometimes getting deodorized oil. They blend it with some oil that has some character to give it a little color, a little flavor...and they sell that as extra virgin. It's illegal - it happens all the time.

Extra virgin must come from the first press of olives and be free of additives. It's fruity, aromatic and has a spicy finish. The best can sell for $50-a-gallon...but a fake costs just seven dollars to make. The profit margin can be three times better than cocaine.

Sergio Tirro: I would like to show you how easy is to make a genuine fake extra virgin...

Bill Whitaker: Genuine, fake extra virgin olive oil?

Sergio Tirro: Genuine fake extra virgin olive oil...you just need some seed oil.

Bill Whitaker: What kind of seed oil?

Sergio Tirro: It is sunflower oil, no smell at all.

Bill Whitaker: None.

Sergio Tirro: Then we just have to add a few drops of chlorophyll.

Bill Whitaker: For color?

Sergio Tirro: For color.

Bill Whitaker: ...and it becomes the color of olive oil.

Sergio Tirro: It becomes the color of olive oil.

Eighty percent of Italy's extra virgin comes from the southern part of the country. So we went to Sicily - where the Mafia remains part of daily life in the streets and in the fields. Nicola Clemenza's olive grove is a 90-minute drive south of Palermo. We went to see him because Clemenza is leading a farmer revolt against Mafia control. His olives are hand-combed from the trees onto nets below and immediately sent to be pressed.

Bill Whitaker: Nicola, what role does the Mafia play in olive oil production here?

Nicola Clemenza (in Italian): "Well, the role that the Mafia plays in the production of oil..."

Clemenza told us the Agromafia dilutes the oil and controls prices. He's defied the mob by organizing 200 farmers to skip the Mafia middle men and sell their oil directly to distributors.

Bill Whitaker: When you organized the farmers, the Mafia retaliated against you?

Nicola Clemenza, translator: On the day I started the consortium, they burned my car, they burned down part of my home and I was inside with my wife and my daughter.

Bill Whitaker: They tried to kill you...

No - he said it was a message to stay quiet. This is a police image of the man Clemenza believes ordered the attack. He is Matteo Messina Denaro -- the boss of bosses for the Cosa Nostra. Many believe he's hiding out in the town not far from Clemenza's fields. Denaro built a $41 million olive oil empire.

Tom Mueller: It's very difficult to say in any given case with olive oil exactly how many drops in a given bottle actually have Mafia blood on them to sound dramatic. It is fairly straightforward to say, however, just how much fraudulent oil is in circulation--

Bill Whitaker: How much?

Tom Mueller: Easily half of the bottles that are sold as extra virgin in supermarkets in Italy do not meet the legal grades for extra virgin oil.

Bill Whitaker: So half here in Italy, what would it be in the U.S.?

Tom Mueller: Up around 75% to 80%, easily.

Yes, you heard right - he said up to 80 percent.

Food imported into the United States is inspected by Customs and Border Protection. Its New Jersey chemists told us they have detected phony oil imported from Italy improperly labeled as extra virgin.

We were curious about what we'd find in a U.S. supermarket. So we shipped three brands of Italian extra virgin we purchased in New York back to the mother country.

They were included in a blind taste test by those experts in Rome. The process is as tightly orchestrated as a Verdi opera. Blue glass hides the oil's color. Separate cubicles prevent cheating.

The panel would not say they were adulterated - but they agreed two brands we purchased back home did not come within a sniff of extra virgin. They described one as lampante -- the lowest quality olive oil. That brand happens to be one of the best-selling in America.

Sergio Tirro: It's not that bad...

Bill Whitaker: It's not that bad...

Sergio Tirro: Not that bad...but maybe for..

Bill Whitaker: Not that good either?

Sergio Tirro: No -- not for my salads. I would never put this on my salad.

Chances are that salad was picked by migrants controlled by the Agromafia too and served in one of Italy's 5,000 mob-owned restaurants. Last spring, these two tourist spots in Rome were temporarily closed for alleged Mafia ties.

And the food businesses not run by gangsters -- often pay them anyway. The extortion is called "pizzo." Refuse -- and you risk broken windows - or worse.

Bill Whitaker: What percentage of the merchants here are paying the pizzo, protection money to the Mafia?

Ermes Riccobono: Actually we cannot know for sure, we could say that a big part of this.

Bill Whitaker: Most of them.

Ermes Riccobono: Most of them, we could say yes.

Ermes Riccobono took us around one of the oldest food markets in Palermo. He works with a group called Addiopizzo, which means "farewell pizzo." It's enlisted 800 stores and restaurants to stop paying the Mafia.

Ermes Riccobono: They've been doing this for so long, generation by generation, that it's normal for them. It's not even a problem.

Bill Whitaker: How much would they be asking of these merchants?

Ermes Riccobono: Might be, I don't know -- 500 Euros, $500 a month or even $5 per week according to the size of the shop.

Add it up and extortion costs Italy at least $6 billion a year.

Bill Whitaker: What makes you think that your young organization is going to stop this?

Ermes Riccobono: Well, it's what we need to do. I mean it's our moral obligation. We are a young generation. And we need to fight.

We were told we could see how the fight has taken root just a short drive from Corleone -- the town made famous by "The Godfather." Over the last decade, cops have taken 3,500 acres away from Mafia owners and given them to the group "Libera Terra" (or "Free Land"). These fields confiscated from the mob have created a booming business for farmers.

Pietro D'Aleo: We have around 80 food and beverage products made with raw materials coming from this land.

Bill Whitaker: Eighty?

Pietro D'Aleo: Eighty.

Marketing manager Pietro D'Aleo gave us a taste of their success.

Bill Whitaker: Smells good.

Pietro D'Aleo: Yes.

A wine called Centopassi that's drawn raves from critics.

Bill Whitaker: Smells delicious. Very well balanced - cheers. Thank you.

Libera Terra products are sold in shops across Italy. It turns out "Mafia-free" is a hot seller -- especially if the food is world class.

"They don't look like the olives on your plate..."

Nicola Clemenza hopes he can break the Agromafia's grip where he lives by exporting directly to American customers.

Bill Whitaker: What is your greatest fear now?

Nicola Clemenza, translator: No, I'm not scared anymore, because fear has turned into anger, it's turned into courage, it's turned into action and now all my free time is dedicated to fight the Mafia...to fight the Mafia with the truth.

He figures his best weapon -- is the liquid gold that tastes as rich as it looks.

Bill Whitaker: You can feel it going all the way down...

Nicola Clemenza: Strong?

Bill Whitaker: Strong!


IPv6 celebrates its 20th birthday by reaching 10% deployment


Twenty years ago this month, RFC 1883 was published: Internet Protocol, Version 6 (IPv6) Specification. So what's an Internet Protocol, and what's wrong with the previous five versions? And if version 6 is so great, why has it only been adopted by half a percent of the Internet's users each year over the past two decades?

10 percent!

First the good news. According to Google's statistics, on December 26, the world reached 9.98 percent IPv6 deployment, up from just under 6 percent a year earlier. Google measures IPv6 deployment by having a small fraction of its users execute a JavaScript program that tests whether the computer in question can load URLs over IPv6. During weekends, a tenth of Google's users are able to do this, but during weekdays it's less than 8 percent. Apparently more people have IPv6 available at home than at work.
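You can run a crude version of the same check yourself. The sketch below is not Google's actual test: it only asks whether this machine can open one TCP connection to one host over IPv6, and the hostname is just an example.

    # Rough dual-stack check: can this machine reach a host over IPv6?
    # (Google's real test loads URLs from a browser; this is far cruder.)
    import socket

    def has_ipv6_path(host="www.google.com", port=443, timeout=3):
        try:
            infos = socket.getaddrinfo(host, port, socket.AF_INET6,
                                       socket.SOCK_STREAM)
            family, socktype, proto, _, addr = infos[0]
            with socket.socket(family, socktype, proto) as s:
                s.settimeout(timeout)
                s.connect(addr)
            return True
        except OSError:
            return False

    print("IPv6 reachable:", has_ipv6_path())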

[Figure: World-wide IPv6 deployment as measured by Google.]

Google also keeps a map of the world with IPv6 deployment numbers per country, handily color-coded for our convenience. More and more countries are turning green, with the US at nearly 25 percent IPv6, and Belgium still leading the world at almost 43 percent. Many other countries in Europe and Latin America and even Canada have turned green in the past year or two, but a lot of others are still stubbornly staying white, with IPv6 deployment figures well below one percent. Some, including China and many African nations, are even turning red or orange, indicating that IPv6 users in those countries experience significantly worse performance than IPv4 users.

[Figure: Per-country IPv6 adoption as measured by Google.]

For the past four years, IPv6 deployment has increased by a factor of 2.5 each year: from 0.4 percent at the end of 2011 to 1 percent in late 2012, 2.5 percent at the end of 2013, and 6 percent a year ago. Having 4 percent of Google's users gain IPv6 connectivity in a single year is a huge number, but considering that outside Africa there are no more IPv4 addresses to be had, the remaining 90 percent of users still on IPv4 are in for a rough ride.

Of course existing IPv4 addresses aren't going anywhere, but without a steady supply of fresh IP addresses, it's hard to keep the Internet growing in the manner it's accustomed to. For instance, when moving to a new place, you may discover your new ISP can't give you your own IPv4 address anymore, and puts you behind a Carrier Grade NAT (CGN) that makes you share one with the rest of the neighborhood. Some applications, especially ones that want to receive incoming connections, such as video chat, may not work well through certain CGNs.

If a 67 percent increase per year is the new normal, it'll take until summer 2020 until the entire world has IPv6 and we can all stop slicing and dicing our diminishing stashes of IPv4 addresses.
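That projection is just compounding, under the article's own assumption that the 67 percent annual growth continues:

    10% x 1.67^n = 100%  =>  n = log(10) / log(1.67) ≈ 4.5 years
    end of 2015 + 4.5 years ≈ summer 2020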

Why the delay?

IPv6 has been around for two decades now. So why are we still measuring IPv6 deployment in percentage points per year, while Wi-Fi didn't exist yet in 1995 and has used up the entire alphabet to label its variants since then, with many users on IEEE 802.11ac today?

In order for a Web browser to show a video of a cat riding a roomba, a bunch of video data needs to be transferred. But that's the easy part: we have more than enough cabling and radio transmitters in place to make that happen. The hard part is knowing what the light pulses and voltages in these cables and the radio waves going through the air actually mean so the receiver can make heads or tails out of it. For that, we have protocols and standards.

Most of the time, when a standard gets updated or replaced, only two devices need to care. For instance, with the explosive growth of the cat on roomba category, Youtube may have needed to upgrade its 10 Gigabit Ethernet connections to 100 Gigabit Ethernet. That's a serious upgrade of the affected servers and routers, but the rest of the internet doesn't care. We've added four zeros to the Ethernet and Wi-Fi speeds over the past decades. But Ethernet, FDDI, Wi-Fi, PPP, Packet over SONET, Resilient Packet Ring and other standards just add their own control data when a packet is transmitted and then remove it after the packet is received at the other end of the line (or radio link). So I have no way of knowing whether a packet I receive from Youtube was transmitted over 10 Gigabit Ethernet or 100 Gigabit Ethernet or something else completely: when I get it, the packet looks the same in each case. This makes upgrading the bandwidth of the internet very easy: just upgrade one connection at a time.

Other standards govern the interpretation of the data by the ultimate receiver. When Youtube upgraded its videos from Adobe Flash to HTML5, obviously the software on Youtube's end had to be upgraded as well as our browsers so those knew how to properly request the videos using HTML5 and then play them as intended. However, the routers along the way don't care. They just see lots of small data packets coming by. That the content of those packets is now formatted slightly differently doesn't influence the way the data is carried through the wires and sent in the right direction. So upgrading applications that run over the internet is harder than increasing the bandwidth because there will always be people visiting Youtube with old browsers. But it's still relatively straightforward: deploy the new standard, but fall back to the old one if necessary. This way, we were able to move from postage stamp sized Sorenson Video-compressed videos to 4k H.264 in about a decade and a half.

Unfortunately, the Internet Protocol is different. The sending system needs to create an IP packet. Then, all the routers along the way (and any firewalls and load balancers) need to look at that IP packet to be able to send it on its way. Finally, the receiving system needs to be able to understand the IP packet to get at the information contained in it. Even worse, the applications on the sending and receiving ends often have to look at IP addresses and thus know the difference between an IPv4 address and an IPv6 address. So we can't just upgrade a server and a client application, or two systems on opposite ends of a cable. We need to upgrade all servers, all clients, all routers, all firewalls, all load balancers, and all management systems to IPv6 before we can retire IPv4 and thus free ourselves of its limitations.

So even though all our operating systems and nearly all network equipment supports IPv6 today (and has for many years in most cases), as long as there's just one device along the way that doesn't understand the new protocol—or its administrator hasn't gotten around to enabling it—we have to keep using IPv4. In that light, having ten percent of users communicate with Google over IPv6 isn't such a bad result.



The Surreal Story of StubHub Screwing Over a Kobe Fan



January 6, 2016

As readers of TheLead know, we love nothing more than some good ol' corporate muckraking, so when a reader named Jesse Sandler emailed us with a story that sounded too bizarre to be true – namely that he brilliantly bought four tickets to the Lakers' last home game (in April against the Jazz) 18 days before Kobe announced his retirement, and a little over a month later, when the price of the tickets had appreciated almost 1000%, StubHub voided the purchase because he had bought the tickets for too low a price – we were excited for the chance to highlight and (hopefully) right the injustice.

After a little back and forth, Jesse sent us an incredible email that recounted the entire story in his own words and provided all the evidence we could ever ask for: his entire email chain with StubHub customer service, ticket receipts, a semi-incriminating voicemail from a FlubHub supervisor, you name it.

Below, Jesse details his fiasco, and convinces us to never, ever, ever purchase tickets on StubHub again. At the end of this piece (if you’re compelled to do so), we ask that you share Jesse’s story on social media, and force StubChub to issue a statement and honor Mr. Sandler’s original purchase. What they did is complete bullshit, and the online ticketing giant deserves a PR nightmare…

Without further ado, Jesse’s story:

ON NOVEMBER 11, 2015 (18 DAYS BEFORE KOBE ANNOUNCED HIS RETIREMENT), I BOUGHT 4 TICKETS TO THE APRIL 13, 2016 LAKERS—JAZZ GAME WITH THE THOUGHT THAT IT MIGHT BE KOBE’S LAST GAME IN THE NBA. THE TOTAL FOR THE FOUR TICKETS (PLUS TAXES AND FEES) WAS $906.77. AFTER I BOUGHT THE TICKETS, I SENT THIS EMAIL TO MY FRIENDS:

On Nov 11, 2015, at 5:05 PM, Jesse Sandler (email removed) wrote:

Hello gents,

I just bought 4 tix for the Lakers-Jazz game on April 13, 2016.  It could possibly be Kobe’s last game at home, so I wanted to go and see that shit.   You guys interested?

J

 CLEARLY, MY FRIENDS ALL BEING ENORMOUS LAKERS/KOBE FANS SAID “YES!” HERE’S A SCREENSHOT OF THE CONFIRMATION AND THE CONFIRMATION NUMBER:

[Screenshot: StubHub order confirmation]

 WE WERE PUMPED! THEN, ON NOVEMBER 29, KOBE ANNOUNCED HIS RETIREMENT AND COMPARABLE SEATS IN SECTION 106 WENT FROM $195 A TICKET TO NEARLY $1500 A TICKET:

[Screenshot: comparable Section 106 listings after Kobe's announcement]

PRETTY SOLID INVESTMENT, RIGHT? THE TICKETS I PAID $900 FOR WERE NOW RETAILING FOR ABOUT $6,000. EVERYONE WAS TELLING ME TO FLIP THEM AND TAKE THE CASH, BUT I HAD ZERO DESIRE TO DO SO. I’VE ALWAYS BEEN A HUGE LAKERS FAN AND A HUGE KOBE FAN…WATCHING KOBE PLAY BASKETBALL HAS BEEN A HUGE PART OF MY LIFE SINCE THE 6TH GRADE, AND I WASN’T ABOUT TO MISS HISTORY…

THAT’S WHEN EVERYTHING FELL APART AND STUBHUB SCREWED ME OVER IN A MANNER I STILL FIND INCOMPREHENSIBLE. A LITTLE OVER TWO WEEKS AFTER KOBE MADE HIS ANNOUNCEMENT, AND OVER A MONTH SINCE I HAD BOUGHT AND BECAME THE RIGHTFUL OWNER OF MY TICKETS, I RECEIVED THIS TYPO-FILLED EMAIL FROM THE WORLD’S WORST TICKETING COMPANY:

[Screenshot: StubHub's cancellation email]

UMMMM…WHAT???? I GAVE THE SELLER AMERICAN CURRENCY, HE OR SHE GAVE ME THE LAKERS TICKETS, END OF TRANSACTION. AND NOW STUBHUB’S TELLING ME THAT THEY ARE TAKING AWAY THE TICKETS OVER A MONTH LATER BECAUSE…THE PRICE WENT UP? WHAT KIND OF MARKETPLACE IS THIS?! AFTER AN HOUR+ OF CONVULSING IN ANGER, I SENT STUBHUB THIS EMAIL (IT JUST CONVEYS MY FRUSTRATION, SO FEEL FREE TO SKIP OVER):

[Screenshot: Jesse's reply to StubHub]

A FEW HOURS LATER, THE A**HOLES AT STUBHUB SENT THIS:

[Screenshot: StubHub's follow-up email]

 THE NEXT DAY, THEY *PERSONALLY* REPLIED TO MY EMAIL WITH THIS:

[Screenshot: StubHub's personal reply]

I LOVE HOW THEY END THE EMAIL: “THANKS FOR USING STUBHUB!” IF THERE’S ANYTHING I CAN CONVEY AFTER THIS EXPERIENCE, IT’S THAT IF THERE’S A BEATLES CONCERT AT THE HOLLYWOOD BOWL WHERE THEY BRING BACK GEORGE AND JOHN FROM THE DEAD AND OPEN WITH TUPAC, I’M STILL NOT BUYING THOSE TICKETS IF THEY’RE ONLY ON STUBHUB.

I CALLED FLUBHUB'S CUSTOMER SERVICE AND WAS GIVEN THE RUN-AROUND YOU'D EXPECT. I SPOKE WITH MULTIPLE REPRESENTATIVES AND NOBODY COULD DO ANYTHING TO HELP – ONE PERSON SAID, "OUR HANDS ARE TIED." THEY WERE APOLOGETIC, AND THEN GAVE ME ANOTHER $100 COUPON (GEE, THANKS), BUT IN THE END I WAS NOWHERE REMOTELY CLOSE TO A SATISFACTORY SOLUTION.

SO I CALLED AND CALLED AND WAITED ON HOLD WITH THE STUBHUB CUSTOMER SERVICE LINE FOR HOURS. I ENDED UP TALKING WITH A WOMAN WHO SAID THAT I COULD SPEAK WITH A “HIGHER UP,” SO I WAITED SOME MORE UNTIL THAT WOMAN RETURNED TO TELL ME THAT I’D BE GETTING A CALL BACK WITHIN 24 HOURS. DURING ONE OF THESE CALLS, THEY THREW IN ANOTHER $50 STUBHUB COUPON. ONCE AGAIN, THEY SAID THAT’S ALL THEY COULD DO.

ON THE EVENING OF THURSDAY, DECEMBER 17, I GOT THE VOICEMAIL I SENT [TO THELEADSPORTS] FROM A REP AT STUBHUB.

 *NOTE: At Jesse’s request, we didn’t include the voicemail because the representative (who states his name/title), was the only StubHub employee who wasn’t a complete dick…

ON FRIDAY, DECEMBER 18, 2015 I CALLED BACK AND TALKED TO ANOTHER "SUPERVISOR" GUY AND ASKED IF THERE WAS ANYTHING MORE THAT THEY COULD DO, BECAUSE $906.77 IN STUBHUB CREDIT (PLUS THE $250 THEY GAVE ME AS A RESULT OF MY PREVIOUS PERSISTENCE) DIDN'T EVEN AMOUNT TO ONE OF MY KOBE RETIREMENT TICKETS. NOT SURPRISINGLY, THE GUY SAID HIS "HANDS WERE TIED" AND THERE WAS "NOTHING MORE HE COULD DO."

SO IN THE END, I HAD ABOUT $1100 IN CREDIT FOR SERVICES FROM A WEBSITE I NOW DEEMED CORRUPT, STILL $400 SHORT OF THE COST OF ONE OF MY PREVIOUS TICKETS. OH, AND I CAN’T AFFORD TO GO TO THE GAME THAT I COULDN’T HAVE BEEN MORE F***ING EXCITED ABOUT BECAUSE I BOUGHT THE TICKETS AT THE EXACT RIGHT TIME.

AFTER CONTINUING TO BOMBARD THE SUPERVISOR WITH “HOW ON EARTH CAN THIS HAPPEN?” QUESTIONS, THE TRUTH WAS REVEALED: HE SAID THAT TYPICALLY STUBHUB CHARGES THE SELLER 20% OF THE ORIGINAL SALE PRICE OF THE TICKETS FOR NOT FULFILLING AN ORDER, AND THAT THE SELLER “COULD” BE BANNED FROM SELLING ON THEIR SITE FOR A “NUMBER OF MONTHS.” IN OTHER WORDS, STUBHUB’S OFFICIAL POLICY INCENTIVIZES SELLERS TO RENEGE WHEN PRICES GO UP MORE THAN 20%.

IN MY CASE, WHERE THE TICKETS APPRECIATED ALMOST 1000% AFTER I BOUGHT THEM, IT'S IN THE SELLER'S BEST INTEREST TO COP OUT. THIS IS A SYSTEM THAT ALLOWS AN ARDENT KOBE FAN TO BUY FOUR TICKETS FOR HIM AND HIS FRIENDS TO THE MAMBA'S FINAL HURRAH, ONLY TO HAVE THOSE TICKETS DISAPPEAR OVER A MONTH LATER DESPITE A SUBSTANTIAL ($906.77) FINANCIAL COMMITMENT. IT ALSO MEANS THAT NO TICKETS YOU EVER BUY ON STUBHUB – EVER – ARE ACTUALLY YOUR TICKETS. THE SELLER CAN JUST CHANGE HIS MIND AT ANY TIME AND TAKE THEM BACK. THIS MAKES NO SENSE AND REVEALS THEIR ENTIRE OPERATION AS A COMPLETE SHAM.

I’M SURE OTHERS OUT THERE HAVE SIMILAR STORIES, AND I WANT YOU TO KNOW…YOU’RE NOT ALONE. I HAD THE INCREDIBLE, ONCE-IN-A-LIFETIME OPPORTUNITY TO WATCH MY FAVORITE ATHLETE SAY GOODBYE TO THE GAME HE’S MADE SO INCREDIBLY SPECIAL FOR ME – AND ALL LAKERS FANS – FOR 20 HISTORIC SEASONS, AND I’LL NEVER GET TO EXPERIENCE THAT, AN EVENT I PAID FOR WITH MY OWN, HARD-EARNED MONEY, BECAUSE STUBHUB IS AN ABSOLUTE JOKE OF A COMPANY. ALL I’M ASKING IS THAT IF THIS STORY RESONATES WITH YOU, PLEASE SHARE IT WITH FRIENDS. THIS SHOULD NOT BE ALLOWED TO HAPPEN IN THE UNITED STATES OF AMERICA. AND, KOBE: I’LL BE SITTING ON MY COUCH, SHOUTING LOUDLY, WEARING A #24 JERSEY OVER A #8 JERSEY, WHILE SOFTLY DABBING MY TEARS WITH A #33 LOWER MERION JERSEY. IT’S BEEN A PLEASURE, MY FRIEND. YOU’RE THE MAN.

There you have it. During the hours he spent dealing with the online ticketing giant, a StubHub employee told Jesse "this same thing" happened when Derek Jeter announced his retirement – an obvious testament to the fact that they knew there was a hole in the system and did nothing to fix it.

If you’re a Lakers fan, or just a fan of “standing up to the man,” share Jesse’s story with your friends, and force StubHub to answer for their corporate bamboozlement…



Three Years as a One-Man Startup



I’ve spent most of the past 3 years creating one language learning web-app, Readlang.

I wrote about my struggle to get this off the ground almost two years ago and was thrilled with the response on Hacker News and elsewhere. I’ve been meaning to write a followup for a long time, but would always convince myself to wait…

Just a couple more tweaks and usage will explode. Then I’ll have something to write about!

Well here I am, three years later. Usage didn’t explode, but grew in fits and starts. I’ve worked hard for 3 years and am still making less than minimum wage. But that’s not as bad as it might sound.

Money

To survive as a bootstrapper, revenue is essential. But it’s often taboo, despite being very useful information. The problem is that if you report a low revenue, you may not be taken seriously. If you earn a lot, people listen but it invites competition and jealousy. I’m still on the low side, so here goes…

New signups grew from 2,800 in year 1 to 9,300 in year 2 and 36,800 in year 3. Revenue was roughly $700 in year 1, $4,100 in year 2, and $16,500 in year 3. Expenses are low, so the profit in year 3 was about $14,500 (roughly £9,700). I'd earn more working 28 hours a week at minimum wage. I worked a lot more than that, so as an experienced software developer I've paid a high opportunity cost.

Will I continue with Readlang?

I regularly question my decision to continue pouring so much time into Readlang. I wonder about the lucrative life of a contractor, or the cushy job of software developer at a large tech company. I wonder if I’m hurting my chances of future employment by working so long on my own.

On the other hand, profits have grown 480% over the past year. If this continues, the future looks good. In year 4 (2016) I would make a typical UK software developer salary, and by the end of year 5 I’d be financially rewarded for the risk I’ve taken compared to being employed. But that’s still two years away. And is it even realistic to expect the trend to continue? I don’t know, but I’m making a bet that while I continue to work on it, the answer is yes.

There are easier ways of making a living. But I’m proud of what I’m making, and it seems to genuinely help people to learn languages. Here are a couple of the many quotes I’ve received by email within the past month:

En primeras palabras quiero decir, que me gusta muchísimo tu pagina. Es de verdad grande trabajo. ("First of all I want to say that I really like your page. It's truly a great piece of work.")

If your Spanish isn’t great, try reading the above quote on Readlang.

I’m in a polyglot group, and we all try different language tools constantly (Memrise, Anki, FluentU, etc), but I think Readlang has been the “stickiest” for the majority of us.

On top of that, prolific language learner Alex Rawlings recently wrote of Readlang:

This simple tool has changed the way I learn languages forever.

Feedback like this assures me that I’m doing something right.

Readlang is ramen profitable, helping more people every day, and there’s still plenty of room for improvement. Of course I’m continuing.

Follow my progress

I plan to write more about creating Readlang. If you’d like to ask a question, suggest a topic, or hear updates, please find me on twitter @Steve_Ridout.

Quiet naval hero who rescued Enigma machine dies aged 95


While the Germans jumped into the Atlantic, 20-year-old David Balme and a small team of sailors climbed into a rowing boat with simple instructions: get what you can out of her.

Balme, who’d been in the Navy for seven years, could not believe the Germans “would have just abandoned this submarine” and was convinced U110 was either booby-trapped, or armed crewman were still on board, lying in wait.

Instead, the boarders found U110 deserted. Telegraphist Allen Long quickly located the coding device, which looked like a typewriter. Long "pressed the keys and, finding results peculiar, sent it up the hatch".

Balme’s party spent six hours salvaging what they could from U110, all the time compressed air hissed from broken pipes and the boat shook under the distant detonations of depth charges being dropped as the convoy escorts harried other suspected German submarines.

Bulldog tried to tow the crippled U110 to Iceland, but she foundered the following day. The destroyer continued on to Scapa Flow in the Orkneys, the RN's main base in both world wars, where the 'typewriter' was handed over to an intelligence officer. "We have waited the whole war for one of these," he told Balme and his shipmates gratefully.

The salvage operation – codenamed Primrose – was, the Admiralty ordered, “to be treated with the greatest secrecy and as few people allowed to know as possible.”

And so when George VI presented David Balme with the DSC for his part in the mission later in 1941, the monarch apologised that “for security reasons” the award could not be higher.

But he did tell the junior officer it was “perhaps the most important single event in the whole war at sea.”

The Enigma machine and accompanying codebook ended up at Bletchley Park, where they would be exploited by maths genius Alan Turing and his colleagues, allowing some German radio traffic to be read by British intelligence.

The story of the seizure of the machine by Balme and his shipmates was kept secret until the mid-1970s and 'Hollywoodised' in 2000 in the blockbuster U-571; the fictionalised account has American submariners, not British destroyermen, rescuing Enigma from a crippled German boat.

David Balme’s career in the RN after Bulldog/U110 was no less dramatic; he commanded a detachment of gunners protecting a merchant ship on the Malta convoys (which was sunk), transferred to the Fleet Air Arm as an observer and flew missions in the Mediterranean; and was the youngest lieutenant commander in the RN when promoted to that rank.

After the war he worked in the family wool business in Hampshire.

Postgres 9.5 Release Notes


A dump/restore using pg_dumpall, or use of pg_upgrade, is required for those wishing to migrate data from any previous release.

Version 9.5 contains a number of changes that may affect compatibility with previous releases. Observe the following incompatibilities:

  • Adjust operator precedence to match the SQL standard (Tom Lane)

    The precedence of <=, >= and <> has been reduced to match that of <, > and =. The precedence of IS tests (e.g., x IS NULL) has been reduced to be just below these six comparison operators. Also, multi-keyword operators beginning with NOT now have the precedence of their base operator (for example, NOT BETWEEN now has the same precedence as BETWEEN) whereas before they had inconsistent precedence, behaving like NOT with respect to their left operand but like their base operator with respect to their right operand. The new configuration parameter operator_precedence_warning can be enabled to warn about queries in which these precedence changes result in different parsing choices.
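
    A minimal sketch of using the new parameter to check an application's queries (the sample query is ours, not from the release notes):

    SET operator_precedence_warning = on;
    -- with the parameter on, the server warns whenever 9.5 parses
    -- an expression differently than pre-9.5 releases would have
    SELECT 2 NOT BETWEEN 1 AND 3;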

  • Change pg_ctl's default shutdown mode from smart to fast (Bruce Momjian)

    This means the default behavior will be to forcibly cancel existing database sessions, not simply wait for them to exit.

  • Use assignment cast behavior for data type conversions in PL/pgSQL assignments, rather than converting to and from text (Tom Lane)

    This change causes conversions of Booleans to strings to produce true or false, not t or f. Other type conversions may succeed in more cases than before; for example, assigning a numeric value 3.9 to an integer variable will now assign 4 rather than failing. If no assignment-grade cast is defined for the particular source and destination types, PL/pgSQL will fall back to its old I/O conversion behavior.
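
    A short sketch of the new assignment behavior (the DO block is ours, not from the release notes):

    DO $$
    DECLARE
        i integer;
    BEGIN
        i := 3.9;  -- assignment cast rounds: i becomes 4, where the old
                   -- text-based conversion raised an error
        RAISE NOTICE 'i = %', i;
    END
    $$;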

  • Allow characters in server command-line options to be escaped with a backslash (Andres Freund)

    Formerly, spaces in the options string always separated options, so there was no way to include a space in an option value. Including a backslash in an option value now requires writing \\.

  • Change the default value of the GSSAPI include_realm parameter to 1, so that by default the realm is not removed from a GSS or SSPI principal name (Stephen Frost)

  • Replace configuration parameter checkpoint_segments with min_wal_size and max_wal_size (Heikki Linnakangas)

    If you previously adjusted checkpoint_segments, the following formula will give you an approximately equivalent setting:

    max_wal_size = (3 * checkpoint_segments) * 16MB
    

    Note that the default setting for max_wal_size is much higher than the default checkpoint_segments used to be, so adjusting it might no longer be necessary.
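
    For example, a server previously tuned with checkpoint_segments = 32 would start from:

    max_wal_size = (3 * 32) * 16MB = 1536MB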

  • Control the Linux OOM killer via new environment variables PG_OOM_ADJUST_FILE and PG_OOM_ADJUST_VALUE, instead of compile-time options LINUX_OOM_SCORE_ADJ and LINUX_OOM_ADJ (Gurjeet Singh)

  • Decommission server configuration parameter ssl_renegotiation_limit, which was deprecated in earlier releases (Andres Freund)

    While SSL renegotiation is a good idea in theory, it has caused enough bugs to be considered a net negative in practice, and it is due to be removed from future versions of the relevant standards. We have therefore removed support for it from PostgreSQL. The ssl_renegotiation_limit parameter still exists, but cannot be set to anything but zero (disabled). It's not documented anymore, either.

  • Remove server configuration parameter autocommit, which was already deprecated and non-operational (Tom Lane)

  • Remove the pg_authid catalog's rolcatupdate field, as it had no usefulness (Adam Brightwell)

  • The pg_stat_replication system view's sent field is now NULL, not zero, when it has no valid value (Magnus Hagander)

  • Allow json and jsonb array extraction operators to accept negative subscripts, which count from the end of JSON arrays (Peter Geoghegan, Andrew Dunstan)

    Previously, these operators returned NULL for negative subscripts.
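
    A quick illustration (the sample array is ours):

    -- returns 3; before 9.5 a negative subscript returned NULL
    SELECT ('[1, 2, 3]'::jsonb) -> -1;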

PostgreSQL 9.5: UPSERT, Row Level Security, and Big Data


Posted on Jan. 7, 2016

7 JANUARY 2016: The PostgreSQL Global Development Group announces the release of PostgreSQL 9.5. This release adds UPSERT capability, Row Level Security, and multiple Big Data features, which will broaden the user base for the world's most advanced database. With these new capabilities, PostgreSQL will be the best choice for even more applications for startups, large corporations, and government agencies.

Annie Prévot, CIO of the CNAF, the French Child Benefits Office, said, "The CNAF is providing services for 11 million persons and distributing 73 billion Euros every year, through 26 types of social benefit schemes. This service is essential to the population and it relies on an information system that must be absolutely efficient and reliable. The CNAF's information system is satisfyingly based on the PostgreSQL database management system."

UPSERT

A most-requested feature by application developers for several years, "UPSERT" is shorthand for "INSERT, ON CONFLICT UPDATE", allowing new and updated rows to be treated the same. UPSERT simplifies web and mobile application development by enabling the database to handle conflicts between concurrent data changes. This feature also removes the last significant barrier to migrating legacy MySQL applications to PostgreSQL.

Developed over the last two years by Heroku programmer Peter Geoghegan, PostgreSQL's implementation of UPSERT is significantly more flexible and powerful than those offered by other relational databases. The new ON CONFLICT clause permits ignoring the new data, or updating different columns or relations in ways which will support complex ETL (Extract, Transform, Load) toolchains for bulk data loading. And, like all of PostgreSQL, it is designed to be absolutely concurrency-safe and to integrate with all other PostgreSQL features, including Logical Replication.
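
As a minimal sketch of the syntax (the table and columns are ours, not from the announcement; it assumes a unique index on page_hits(page)):

    -- insert a counter row, or bump it when the key already exists
    INSERT INTO page_hits (page, hits)
    VALUES ('/home', 1)
    ON CONFLICT (page)
    DO UPDATE SET hits = page_hits.hits + EXCLUDED.hits;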

Row Level Security

PostgreSQL continues to expand database security capabilities with its new Row Level Security (RLS) feature. RLS implements true per-row and per-column data access control which integrates with external label-based security stacks such as SE Linux. PostgreSQL is already known as "the most secure by default." RLS cements its position as the best choice for applications with strong data security requirements, such as compliance with PCI, the European Data Protection Directive, and healthcare data protection standards.

RLS is the culmination of five years of security features added to PostgreSQL, including extensive work by KaiGai Kohei of NEC, Stephen Frost of Crunchy Data, and Dean Rasheed. Through it, database administrators can set security "policies" which filter which rows particular users are allowed to update or view. Data security implemented this way is resistant to SQL injection exploits and other application-level security holes.
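
A small sketch of what such a policy looks like (the schema is ours):

    -- each database user may see and modify only their own rows
    ALTER TABLE accounts ENABLE ROW LEVEL SECURITY;
    CREATE POLICY account_owner ON accounts
        USING (owner = current_user);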

Big Data Features

PostgreSQL 9.5 includes multiple new features for bigger databases, and for integrating with other Big Data systems. These features ensure that PostgreSQL continues to have a strong role in the rapidly growing open source Big Data marketplace. Among them are:

BRIN Indexing: This new type of index supports creating tiny, but effective indexes for very large, "naturally ordered" tables. For example, tables containing logging data with billions of rows could be indexed and searched in 5% of the time required by standard BTree indexes.
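
For instance, a hedged sketch on a hypothetical logging table:

    -- BRIN stores only small per-block-range summaries, so the index
    -- stays tiny even when the table holds billions of rows
    CREATE INDEX logs_logged_at_brin ON logs USING brin (logged_at);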

Faster Sorts: PostgreSQL now sorts text and NUMERIC data faster, using an algorithm called "abbreviated keys". This makes some queries which need to sort large amounts of data 2X to 12X faster, and can speed up index creation by 20X.

CUBE, ROLLUP and GROUPING SETS: These new standard SQL clauses let users produce reports with multiple levels of summarization in one query instead of requiring several. CUBE will also enable tightly integrating PostgreSQL with more Online Analytic Processing (OLAP) reporting tools such as Tableau.
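
For example, one illustrative query (the sales table is ours):

    -- per (region, product) subtotals, per-region subtotals,
    -- and a grand total, all in a single pass
    SELECT region, product, sum(amount)
    FROM sales
    GROUP BY ROLLUP (region, product);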

Foreign Data Wrappers (FDWs): These already allow using PostgreSQL as a query engine for other Big Data systems such as Hadoop and Cassandra. Version 9.5 adds IMPORT FOREIGN SCHEMA and JOIN pushdown, making query connections to external databases both easier to set up and more efficient.
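
With a foreign server already configured, the new import looks like this (the server and schema names are ours):

    -- pull in all remote table definitions at once instead of
    -- writing each CREATE FOREIGN TABLE by hand
    IMPORT FOREIGN SCHEMA public
        FROM SERVER analytics_server INTO local_mirror;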

TABLESAMPLE: This SQL clause allows grabbing a quick statistical sample of huge tables, without the need for expensive sorting.
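
For example (the events table is ours):

    -- scan roughly 1% of the table's blocks rather than the whole table
    SELECT count(*) FROM events TABLESAMPLE SYSTEM (1);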

    "The new BRIN index in PostgreSQL 9.5 is a powerful new feature which enables PostgreSQL to manage and index volumes of data that were impractical or impossible in the past. It allows scalability of data and performance beyond what was considered previously attainable with traditional relational databases and makes PostgreSQL a perfect solution for Big Data analytics," said Boyan Botev, Lead Database Administrator, Premier, Inc.


Contact

PostgreSQL Press Team
press@postgresql.org
Phone: +1 (347) 674-7759

About PostgreSQL

PostgreSQL is the world's most advanced database system, with a global community of thousands of users and contributors and dozens of companies and organizations. The PostgreSQL Project builds on over 25 years of engineering, starting at the University of California, Berkeley, and has an unmatched pace of development today. PostgreSQL's mature feature set not only matches top proprietary database systems, but exceeds them in advanced database features, extensibility, security and stability. Learn more about PostgreSQL and participate in our community at: http://www.postgresql.org.
