Hacker Newsnew | past | comments | ask | show | jobs | submitlogin

It's easy if you don't need exact results!\

I didn't mean randomizing the votes. Perhaps I should have written "shuffling." I meant randomizing the order of the stream.



Let me see if I'm misunderstanding something about the problem.

1. The items come in a stream, so you have to accept each one in order.

2. You have to output a stream too.

3. You get a fixed amount of storage, much smaller than the stream.

4. The items are arbitrary and unique, so there is no way to compact n items into significantly less then O(n) storage.

If the stream is large and sorted, you run out of buffer before the input stream gets past 'A'. You're forced to output thousands/millions of entries in a row that start with A. That doesn't look at all random.

It seems impossible if I understand the problem. Is one of my numbered statements wrong? Is there a way to get items out of order? Put items back into the stream? Is there a limited range of items? Getting unique items in order lets me compress the data very slightly, but 25% more storage doesn't fix anything either.


getting something that looks "reasonably random" to casual inspection by human beings, using only O(1) space.

7 (+/- 2) is the magic number. However, you make a good point. This is highly dependent on exactly how large the data is, and how "casual" the human are.


The size of working memory... that sounds like a threshold for taking an already-randomized list and turning it into another one that looks different on casual inspection.

Even with a small size like 200 a shuffler that weak won't do a good job of turning sorted into unsorted, even at a glance.


a shuffler that weak won't do a good job of turning sorted into unsorted, even at a glance.

That depends on how casual the inspection is.


I would expect the most casual inspection to actually be the least affected, since it would be be the most focused on the first letter or two of each entry.


Randomly choosing the votes sequentially with no memory is arbitrarily close to being the same thing as shuffling for sufficiently high numbers of votes.


Except that the votes will come in with names in alphabetical order, which is one of the factors cited in the article which got the students caught.


hmm you can randomise the time at which you next put something into the stream right? I am thinking you can sleep() until the next time, just by doing rand() * someMultiplier * aSecond


How do you do that and wind up with a stochastic O(1) space as opposed to O(n) space? (Remember, O(n/K) is O(n))




Guidelines | FAQ | Lists | API | Security | Legal | Apply to YC | Contact

Search: