Hacker News | eelsen's comments

no


This is clearly an amazing resource on advanced GPU programming regardless of programming language.


Try programming in both CUDA and OpenCL and see which one you would choose.


The claim of a 700,000x speedup makes me suspicious of pretty much everything else about the work.


The question is what kind of processing he did. Database systems are usually heavily constrained by the I/O bandwidth of the system - even the cheapest consumer CPU can process data orders of magnitude faster than disks can deliver it.

That is why database servers usually have the fastest disks money can buy, as much memory as you can fit in order to keep as much data as possible in main memory, and the largest possible caches to avoid hitting the relatively slow memory bus. The situation becomes much worse if the database is not read-only, because every change has to be persisted and you have to hit the disk on every write.

Therefore I think using GPUs will buy you nothing for common database applications - processor speed was never your problem. But this does not exclude the possibility that there are operations, for example correlating millions of data points from your data set, where the processing power of GPUs comes in handy - though this is most likely not what you will see in one of your line-of-business applications.
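The I/O-bound argument above is easy to check with a back-of-envelope calculation. All numbers below are illustrative assumptions (a 100 GB table, a ~500 MB/s SATA SSD, ~20 GB/s DRAM bandwidth), not measurements:

```python
# Back-of-envelope: time to scan a table from disk vs. from memory.
# Every number here is an assumed, illustrative figure.

def scan_seconds(data_bytes, bandwidth_bytes_per_s):
    """Time to stream `data_bytes` at a given sustained bandwidth."""
    return data_bytes / bandwidth_bytes_per_s

TABLE = 100 * 10**9   # a 100 GB table (assumed)
DISK = 500 * 10**6    # ~500 MB/s SATA SSD (assumed)
MEMORY = 20 * 10**9   # ~20 GB/s DRAM bandwidth (assumed)

print(f"from disk:   {scan_seconds(TABLE, DISK):.0f} s")    # 200 s
print(f"from memory: {scan_seconds(TABLE, MEMORY):.0f} s")  # 5 s
```

With these numbers the disk-resident scan is 40x slower before the CPU does any work at all, which is why keeping the data in main memory matters more than raw compute for most queries.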

The number 700,000 is probably the result of a highly tuned GPU implementation versus a general-purpose CPU implementation. A quad-core Core i7 at 3.0 GHz has a theoretical peak performance of 96 GFLOPS [1]; the currently fastest supercomputer, Titan, peaks at 27,112,500 GFLOPS (560,640 cores, 8.2 MW [2]). 700,000 times the mentioned Core i7 is 67,200,000 GFLOPS - still more than twice the peak performance of Titan.

[1] http://stackoverflow.com/questions/15655835/flops-per-cycle-... [2] http://www.top500.org/lists/2012/11/
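The arithmetic in the comment above checks out; here it is spelled out using only the figures quoted from [1] and [2]:

```python
# Sanity-check the comment's arithmetic: a 700,000x speedup over a
# 96 GFLOPS quad-core Core i7 would imply more peak throughput than
# Titan, the fastest supercomputer on the quoted TOP500 list.

I7_GFLOPS = 96                # theoretical peak of the quoted Core i7
TITAN_GFLOPS = 27_112_500     # Titan's peak from the TOP500 list
SPEEDUP = 700_000

implied = I7_GFLOPS * SPEEDUP
print(implied)                       # 67200000 GFLOPS
print(implied / TITAN_GFLOPS)        # ~2.48x Titan's peak
```

So a literal 700,000x over peak CPU throughput is physically implausible; the factor has to come from comparing against a badly suboptimal baseline, not from raw FLOPS.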


It appears from the graphs (yeah, I know) that he's probably doing trig (to locate the tweet) and collision detection (to determine the area of the city). He could also have more control over single precision algorithm choice than he would have via his original solution. It could be something as simple as pre-calculating the geolocation with a parallel CUDA kernel during ETL into PostGIS and bypassing its index generation.
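The "pre-calculating the geolocation during ETL" idea can be sketched in a few lines. Everything below is hypothetical - the city list, the nearest-center assignment, and the haversine formula are my own stand-ins, not anything from the talk - but it shows the shape of the per-point computation a parallel CUDA kernel would run:

```python
import math

# Hypothetical sketch of precomputing geolocation during ETL: assign
# each point to the nearest city center once, up front, so later
# queries never touch a spatial index. A CUDA kernel would run this
# same per-point computation in parallel; this is a serial stand-in.

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres."""
    r = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dp = math.radians(lat2 - lat1)
    dl = math.radians(lon2 - lon1)
    a = math.sin(dp / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dl / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))

def nearest_city(lat, lon, cities):
    """cities: {name: (lat, lon)} -> name of the closest center."""
    return min(cities, key=lambda c: haversine_km(lat, lon, *cities[c]))

cities = {"Boston": (42.36, -71.06), "NYC": (40.71, -74.01)}
print(nearest_city(42.0, -71.5, cities))  # Boston
```

Single precision, as the comment notes, is a real lever here: on a GPU the float32 trig is far cheaper than float64, and for city-scale assignment the lost precision rarely matters.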

I also have an itchy feeling that there may have been some confusion regarding conversions between "times" and "percent", possibly multiple times. [1]

Extrapolating that an iterative solution would take 40 days, without running it for 40 days, is tricky as well. Given multiple tweets in close proximity, PostGIS should benefit from caching the R-tree index.

I'm not assuming the inventor is doing these things, since his db prof was impressed, as were CSAIL and the body awarding his prize. [2] I think, if anything, the reporting is intended to be high-level enough to have broad appeal, which makes it more difficult for us to evaluate the results in a thought experiment. "Reduce slowness here..."

[1] The sidebar says, 'Some statistical algorithms run 70 times faster compared to CPU-based systems like MapReduce,' while the quote in the article says, '“The speed ups over PostGIS … I’m not an expert, and I’m sure I could have been more efficient in setting up the system in the first place, but it was 700,000 times faster,” Mostak said. “Something that would take 40 days was done in less than a second.”'

[2] Hey, logical fallacies do not necessarily mean the argument is wrong!


This is what I was wondering as well. I work on a big-data BI solution, and in my experience I/O, not processing time, is the final bottleneck. I would expect working on the GPU to be slower, not faster, than working on the CPU, because of the extra time needed to copy the data/results into/out of VRAM. Also, 40M tweets is what I would consider the very bottom of the big-data scale. I would guess the use case here is one where the calculation is very complex, rather than the data being large.
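The VRAM-copy concern is also easy to quantify. With assumed sizes (16 bytes per tweet row, ~12 GB/s effective PCIe 3.0 x16 bandwidth - both illustrative guesses), the transfer is cheap for a dataset this small:

```python
# Rough cost of copying the working set into VRAM before the GPU can
# start. Row size and PCIe bandwidth are illustrative assumptions.

TWEETS = 40 * 10**6
BYTES_PER_ROW = 16      # e.g. lat/lon floats plus a key (assumed)
PCIE_BW = 12 * 10**9    # ~12 GB/s effective PCIe 3.0 x16 (assumed)

data = TWEETS * BYTES_PER_ROW   # 640 MB total
print(f"{data / 10**6:.0f} MB, "
      f"{data / PCIE_BW * 1000:.0f} ms to transfer")
```

At well under 1 GB, the whole dataset also fits in GPU memory once, after which repeated queries pay no transfer cost at all - which supports the "very bottom of the big-data scale" point.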


Here's a technical talk: http://www.youtube.com/watch?v=WSvh5ZPrR4w I couldn't find a paper, but fortunately this video is only 10 minutes.

He's got an in-memory column store, so (AFAIK) those are immediate advantages over PostGIS. He's also using pre-rendering of shapes into bitmaps (a form of caching/indexing) along with GPU hardware to accelerate intersection queries. It looks like this is a case where a problem (2D geo data) maps quite naturally to GPU hardware (2D textures) if only you can discover the mapping.
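The pre-rendering idea can be illustrated on the CPU. This is a sketch under my own assumptions (the shape, grid size, and helpers are made up, and the real system samples 2D textures rather than Python lists), but it shows why rasterizing a shape once makes point-in-shape tests trivially cheap:

```python
# Sketch of pre-rendering: rasterize a shape once into a bitmap, after
# which point-in-shape tests become O(1) grid lookups - the CPU
# analogue of sampling a 2D texture on the GPU.

GRID = 100  # 100x100 cells over the unit square (assumed resolution)

def rasterize(inside, grid=GRID):
    """Precompute inside/outside for every cell center."""
    return [[inside((x + 0.5) / grid, (y + 0.5) / grid)
             for x in range(grid)] for y in range(grid)]

def lookup(bitmap, x, y, grid=GRID):
    """Approximate point-in-shape via the precomputed bitmap."""
    return bitmap[min(int(y * grid), grid - 1)][min(int(x * grid), grid - 1)]

# Example shape: a disc of radius 0.3 centered at (0.5, 0.5).
disc = rasterize(lambda x, y: (x - 0.5) ** 2 + (y - 0.5) ** 2 <= 0.09)
print(lookup(disc, 0.5, 0.5), lookup(disc, 0.95, 0.95))  # True False
```

The trade-off is the usual caching one: accuracy is limited by grid resolution, and the bitmap must be rebuilt if the shape changes, but intersection queries no longer depend on polygon complexity at all.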


Some really nice ideas in there - I especially like the tweet-to-composite-number-via-word-to-prime-mapping-thing.


Not me. PostGIS has done an amazing job given its constraints, but it is pretty obvious that there are better ways to do certain operations. It's not that I believe him... it's just that I think it's plausible enough that the claim is not unbelievable.


So his comment is more along the lines of, "I was using a screwdriver to pound in this nail and it was taking forever, but then I started using a hammer and man...it went fast!"


Yes...that is how I see it.


Hi - MapD creator here - I love Postgres/PostGIS and use them all the time - there are just some things that are going to execute faster in parallel with high memory bandwidth. And then there is the algorithm - see my post above for an explanation.


It says 70?


And, later: "“The speed ups over PostGIS … I’m not an expert, and I’m sure I could have been more efficient in setting up the system in the first place, but it was 700,000 times faster,” Mostak said. “Something that would take 40 days was done in less than a second.”"



