brianchu on Nov 2, 2015 | on: Apache Singa, a Distributed Deep Learning Platform
According to the paper, the parameters fit on one GPU (or at least a single GPU was able to train the model); training on one GPU was just too slow, so they trained on 8 GPUs in parallel. But those GPUs were still on the same machine (one node, multiple GPUs).
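For anyone unfamiliar with the distinction, what's being described is single-node data parallelism: the full model fits on each GPU, and the batch is split across GPUs to speed up training. As a rough illustration only (this is not the Singa API or the paper's code, and the model, batch size, and hyperparameters are made-up placeholders), here is a minimal PyTorch sketch of that kind of setup:

    # Minimal sketch of single-node, multi-GPU data-parallel training.
    # The full model is replicated on every visible GPU on this one machine;
    # each replica handles a slice of the mini-batch.
    import torch
    import torch.nn as nn

    # Placeholder model; stands in for whatever network the paper trained.
    model = nn.Sequential(
        nn.Linear(1024, 4096),
        nn.ReLU(),
        nn.Linear(4096, 10),
    )

    # Wrap in DataParallel when more than one GPU is present on this node.
    if torch.cuda.device_count() > 1:
        model = nn.DataParallel(model)
    model = model.to("cuda" if torch.cuda.is_available() else "cpu")

    optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
    loss_fn = nn.CrossEntropyLoss()

    # Dummy batch standing in for real training data.
    device = next(model.parameters()).device
    inputs = torch.randn(256, 1024, device=device)
    targets = torch.randint(0, 10, (256,), device=device)

    # One training step: the batch is scattered across the GPUs,
    # outputs are gathered, and gradients are reduced onto the primary device.
    optimizer.zero_grad()
    loss = loss_fn(model(inputs), targets)
    loss.backward()
    optimizer.step()

Multi-node training (what Singa targets) adds a parameter server or collective communication between machines on top of this; the comment's point is that the paper's 8-GPU run did not need that.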