![logo webcamp](images/logo-webcamp.png)
Josip Maslać
if you didn’t have the chance to watch me present these slides at the conference, fear not!
there should be a video available at: 2016.webcamp.si
you can press letter S and get more or less everything I said
Josip Maslać
average GNU/Linux user
founder & co-owner
wanted: hr_HR ⇒ sl_SI translators
250k unique visitors
450k visits
1 900k pageviews
99% of the site’s content is dynamic
requirement ⇒ minimal change to the application code
all services run on both servers but just some are "active"
mysql@srv2 is "active" ⇒ everybody connects to that node
change database connections on the fly
our choice: High Availability Proxy (HAProxy)
free, very fast and reliable solution offering high availability, load balancing, and proxying for TCP and HTTP-based applications
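The active/passive idea behind "everybody connects to that node" can be sketched in a few lines. This is only an illustration of the selection rule, not HAProxy's implementation or configuration: the node list, the `pick_active` helper, and the plain TCP check are all assumptions made for the example.

```python
import socket
from typing import Callable, Optional, Sequence


def tcp_alive(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_active(
    nodes: Sequence[tuple[str, int]],
    is_alive: Callable[[str, int], bool] = tcp_alive,
) -> Optional[tuple[str, int]]:
    """Return the first healthy node; later nodes act as passive standbys."""
    for host, port in nodes:
        if is_alive(host, port):
            return (host, port)
    return None  # no node is reachable


# Hypothetical node list: srv2 is preferred ("active"), srv1 is the standby.
MYSQL_NODES = [("srv2", 3306), ("srv1", 3306)]
```

The application keeps talking to one fixed address (the proxy), and the proxy re-evaluates something like `pick_active` as health checks change. That is why the application code barely needs to change.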
important terms:
high availability
high availability & scaling
eh, so so…
is hard!!
hasn’t anyone done this before?!
U in CRUD is the (biggest) problem
in our case
web apps typically have a read/write ratio of around 80/20
separate write & read operations ⇒ non-trivial task
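A naive way to separate reads from writes is to route each SQL statement by its first keyword. The sketch below is hypothetical (the `route` function and the `"primary"`/`"replica"` labels are not from the talk) and is deliberately simplistic, to show where the "non-trivial" part starts.

```python
# Statements that modify data or schema must go to the writable node.
WRITE_VERBS = {"INSERT", "UPDATE", "DELETE", "REPLACE", "CREATE", "ALTER", "DROP"}


def route(sql: str) -> str:
    """Return "primary" for statements that modify data, "replica" otherwise."""
    words = sql.lstrip().split(None, 1)
    verb = words[0].upper() if words else ""
    return "primary" if verb in WRITE_VERBS else "replica"
```

Even this tiny router already gets real cases wrong: a `SELECT` issued right after an `UPDATE` may hit a replica that has not caught up yet, and statements inside a transaction should all go to the same node. Handling those cases usually means touching application code.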
replication is done in both directions
painfully hard
concurrency issues
we "ignored" the problems by:
using active/passive approach
at any given moment only one database node can be used
not really scaling
but at least we got high availability
for our use case "good enough"
has its own solution ⇒ SolrCloud
requires min. of 3 nodes/servers ⇒ no go
our app - minimal effort to separate write & read operations
so we used standard master-slave replication
[again] Hasn’t anyone done this before?!
Manually syncing (rsync/lsync) - problems:
[again] concurrency issues
who should overwrite whom
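The "who overwrites whom" problem can be shown with a toy model. The rule used here (the copy with the later modification time wins) is an assumption picked for illustration, not a description of how rsync or any other tool decides; `FileVersion` and `sync_newest_wins` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class FileVersion:
    content: str
    mtime: float  # last-modified time, in seconds


def sync_newest_wins(a: FileVersion, b: FileVersion) -> FileVersion:
    """Two-way sync rule: the copy with the later mtime overwrites the other."""
    return a if a.mtime >= b.mtime else b


# The same file is edited on both servers between two sync runs.
on_srv1 = FileVersion(content="edit made on srv1", mtime=100.0)
on_srv2 = FileVersion(content="edit made on srv2", mtime=101.0)

winner = sync_newest_wins(on_srv1, on_srv2)
# winner holds the srv2 edit; the srv1 edit is silently discarded.
```

Whatever rule is chosen, one of the two concurrent edits is lost unless something coordinates the writes as they happen.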
scale-out network-attached storage file system
main use cases:
storing & accessing LARGE amounts of data
ensuring high availability (file replication)
transparent to applications
easy to set up and use
knows the concept of a volume
different types of volumes (distributed, replicated, striped…)
replicated volume - file replication & synchronization:
when a file is updated, Gluster makes sure it is updated on every server that holds a copy (in a synchronous & transactional manner)
replication: synchronous & transactional
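What "synchronous & transactional" means for a write can be sketched with an in-memory model. This is not how Gluster is implemented, and it does not cover what Gluster actually does when a server is down; `Replica` and `replicated_write` are hypothetical names used only to illustrate the two properties.

```python
class Replica:
    """In-memory stand-in for one server's copy of a volume (hypothetical)."""

    def __init__(self, name: str, online: bool = True) -> None:
        self.name = name
        self.online = online
        self.files: dict[str, bytes] = {}

    def write(self, path: str, data: bytes) -> None:
        if not self.online:
            raise ConnectionError(f"{self.name} is unreachable")
        self.files[path] = data


def replicated_write(replicas: list[Replica], path: str, data: bytes) -> None:
    """Write to every replica before returning; undo everything on failure."""
    # Remember the old contents so a failed write can be rolled back.
    previous = [(r, r.files.get(path)) for r in replicas]
    try:
        for r in replicas:
            r.write(path, data)  # synchronous: the caller waits for each copy
    except ConnectionError:
        # Transactional: no replica keeps a half-applied update.
        for r, old in previous:
            if old is None:
                r.files.pop(path, None)
            else:
                r.files[path] = old
        raise
```

Synchronous means the caller only gets control back once all copies are written. Transactional means the replicas never end up disagreeing about a file after a failed write.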
no time for details
why not Amazon S3 (or similar)
requires (significant?) updates to the application code
high availability
scaled a bit
almost didn’t touch app code (changed 5~10 lines)
learned a LOT
a BUNCH of stuff
session issues <> load balancing
server provisioning
containerisation (docker)
and a lot more…
I lied
Contact info
Don’t forget