Database Scalability: Vertical Scaling vs Horizontal Scaling

Every organization must be able to handle increases and decreases in business demand, so it is crucial that its databases can scale. Let’s read more about it.

Database Scalability

Scalability is the ability to expand computing resources to handle a growing workload. It refers to a system’s capacity to handle an increase in load by delivering more total output when resources are added. Database scalability is the ability of a system’s database to scale up or down as required. If the database isn’t scalable, processes can slow down or even fail, which can be detrimental to business operations. A scalable database can also grow to support more transactions as the volume of business or the customer count grows.
There are two types of database scalability:

  1. Vertical Scaling or Scale-up
  2. Horizontal Scaling or Scale-out

Let us look at each of them in detail.

Scale-up or Vertical Scaling

Vertical scaling is the process of adding more physical resources, such as memory, storage and CPU, to the existing database server to improve performance. It upgrades the capacity of the server you already have, and the result is a single, more powerful system. Its pros and cons are listed below.

Pros of Scaling-Up

  • It consumes less power than running multiple servers
  • Administrative effort is reduced because you manage just one system
  • Cooling costs are lower than with horizontal scaling
  • Software costs are reduced
  • Implementation isn’t difficult
  • Licensing costs are lower
  • The application compatibility is retained

Cons of Scaling-Up

  • A single server is a single point of failure, so a hardware failure can cause a bigger outage
  • Limited scope for future upgrades
  • Severe vendor lock-in
  • The overall cost of implementation is high

Scale-Out or Horizontal Scaling

Adding more servers, each typically with less RAM and fewer processors than one large machine, is known as horizontal scaling. It can also be defined as the ability to increase capacity by connecting multiple hardware or software entities so that they function as a single logical unit. It is usually cheaper as a whole and can in theory scale almost without limit, although in practice the software and other attributes of the environment’s infrastructure impose limits. When servers are clustered, the original server is scaled out horizontally. If a cluster requires more resources to improve performance and provide high availability, the administrator can scale out by adding more servers to the cluster.
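To make the “single logical unit” idea concrete, here is a minimal Python sketch of a pool that hands requests to its servers in round-robin order. The `ServerPool` class, its method names and the server names are hypothetical and only for illustration; a real cluster would rely on a load balancer or on the database’s own driver for this.

```python
from itertools import cycle


class ServerPool:
    """Hypothetical pool of servers that callers address as one unit."""

    def __init__(self, servers):
        self._servers = list(servers)
        self._rotation = cycle(self._servers)

    def add_server(self, name):
        # Scaling out: grow the pool, then rebuild the rotation
        # so the new server starts receiving requests.
        self._servers.append(name)
        self._rotation = cycle(self._servers)

    def route(self, request):
        # Round-robin: each request goes to the next server in turn.
        server = next(self._rotation)
        return server, request


pool = ServerPool(["db1", "db2"])
print([pool.route(f"query-{i}")[0] for i in range(4)])

pool.add_server("db3")
print([pool.route(f"query-{i}")[0] for i in range(3)])
```

The caller only ever talks to `pool`, so adding `db3` changes capacity without changing the calling code.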

Pros of Scaling-out

  • Much cheaper compared to scaling-up
  • Takes advantage of smaller systems
  • Easy to upgrade
  • Resilience is improved due to the presence of discrete, multiple systems
  • Fault tolerance is easier to implement
  • Supports linear increases in capacity

Cons of Scaling-out

  • Licensing fees are higher
  • Utility costs such as cooling and electricity are higher
  • It has a bigger footprint in the data center
  • More networking equipment such as routers and switches may be needed

Scaling out is not a new concept, but it has gained momentum as storage startups create new architectures that leverage the power of modern x86 servers. This addresses various limitations of the older scale-up architectures. However, the model carries some risks. Each x86 server is a failure domain that doesn’t exist in a scale-up environment. This is usually handled by the layout of data on the array, where copies of the data are kept on at least two nodes. Another risk is that upgrading the nodes is complex: rolling out new software to a hundred nodes becomes a tedious task.
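The idea of keeping copies of data on at least two nodes can be sketched as follows. This is only an illustration: the function names, the node names and the placement rule (hash the key, then put the second copy on the next node in the list) are assumptions, and real storage systems use their own placement schemes.

```python
import zlib


def replica_nodes(key, nodes, copies=2):
    """Return the nodes that hold a copy of `key` (first copy first)."""
    if copies > len(nodes):
        raise ValueError("need at least as many nodes as copies")
    # A stable hash picks the first node; the other copies go to the
    # nodes that follow it, wrapping around at the end of the list.
    start = zlib.crc32(key.encode("utf-8")) % len(nodes)
    return [nodes[(start + i) % len(nodes)] for i in range(copies)]


def readable_after_failure(key, nodes, failed):
    """True if at least one copy of `key` is on a node that is still up."""
    return any(node != failed for node in replica_nodes(key, nodes))


nodes = ["n1", "n2", "n3"]
print(replica_nodes("order:1001", nodes))
print(readable_after_failure("order:1001", nodes, failed="n1"))
```

Because the two copies always land on different nodes, losing any single node still leaves one readable copy of every key.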


 

Horizontal scaling means that you scale by adding more machines to your pool of resources, whereas vertical scaling means that you scale by adding more power (CPU, RAM) to an existing machine.

An easy way to remember this is to think of machines in a server rack: we add more machines in the horizontal direction and add more resources to a machine in the vertical direction.

In the database world, horizontal scaling is often based on partitioning the data, i.e. each node contains only part of the data. In vertical scaling the data resides on a single node, and scaling is done through multi-core, i.e. by spreading the load across the CPU and RAM resources of that machine.
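As a rough illustration of partitioning, the sketch below assigns each row to one of several nodes by hashing its key, so that every node ends up holding only part of the data. The function names, the `id` key field and the sample rows are made up for this example, and hash-modulo is just one simple partitioning rule among several.

```python
import zlib


def partition_for(key, node_count):
    """Map a key to a node index using a stable hash."""
    return zlib.crc32(key.encode("utf-8")) % node_count


def partition_rows(rows, node_count, key_field="id"):
    """Split rows into one list per node, based on each row's key."""
    shards = [[] for _ in range(node_count)]
    for row in rows:
        index = partition_for(str(row[key_field]), node_count)
        shards[index].append(row)
    return shards


rows = [{"id": i, "name": f"user{i}"} for i in range(12)]
for index, shard in enumerate(partition_rows(rows, node_count=3)):
    print(f"node {index}: ids {[row['id'] for row in shard]}")
```

One known drawback of plain hash-modulo is that changing `node_count` reassigns most keys, which is why production systems often use more elaborate schemes.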

With horizontal scaling it is often easier to scale dynamically by adding more machines to the existing pool. Vertical scaling is often limited to the capacity of a single machine; scaling beyond that capacity often involves downtime and comes with an upper limit.
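The difference between “add more machines” and “hit an upper limit” can be shown with some simple arithmetic. The sketch below is a back-of-the-envelope model only: it assumes load spreads evenly and capacity grows linearly with node count, and all the numbers and function names are hypothetical.

```python
import math


def nodes_needed(demand_ops, per_node_ops):
    """Horizontal scaling: how many equal nodes cover the demand."""
    return math.ceil(demand_ops / per_node_ops)


def smallest_machine(demand_ops, machine_sizes):
    """Vertical scaling: smallest single machine that covers the demand.

    Returns None when the demand exceeds the largest machine on offer,
    which is the upper limit of scaling up.
    """
    for size in sorted(machine_sizes):
        if size >= demand_ops:
            return size
    return None


sizes = [10_000, 20_000, 40_000]  # hypothetical ops/sec per machine size

for demand in (25_000, 50_000):
    print(
        f"demand {demand}: "
        f"{nodes_needed(demand, 10_000)} small nodes, "
        f"or one machine of size {smallest_machine(demand, sizes)}"
    )
```

In this model a demand above the largest machine size has no vertical answer at all, while the horizontal answer simply grows by one more node.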

Good examples of horizontal scaling are Cassandra and MongoDB. A good example of vertical scaling is MySQL on Amazon RDS (a managed cloud service that can run MySQL), which provides an easy way to scale vertically by switching from smaller to bigger machines. This process often involves downtime.

In-memory data grids such as GigaSpaces XAP and Coherence are often optimized for both horizontal and vertical scaling simply because they’re not bound to disk: horizontal scaling through partitioning and vertical scaling through multi-core support.

Database scalability helps eliminate performance bottlenecks. Understand your company’s scalability needs and plan for them. It is critical that you weigh the pros and cons of both vertical and horizontal scaling before you decide what to implement. What works for other companies may not work for you. Check the benefits of both types against your requirements and implement the one that fits.

SQL vs NoSQL: High-Level Differences

  • SQL databases are primarily called relational databases (RDBMS), whereas NoSQL databases are primarily called non-relational or distributed databases.
  • SQL databases are table-based, whereas NoSQL databases are document stores, key-value stores, graph databases or wide-column stores. SQL databases represent data as tables consisting of n rows, whereas NoSQL databases are collections of key-value pairs, documents, graphs or wide columns that do not have a standard schema definition to adhere to.
  • SQL databases have a predefined schema, whereas NoSQL databases have a dynamic schema for unstructured data.
  • SQL databases are typically scaled vertically, whereas NoSQL databases are typically scaled horizontally. SQL databases are scaled by increasing the horsepower of the hardware. NoSQL databases are scaled by adding database servers to the pool of resources to spread the load.
  • SQL databases use SQL (Structured Query Language) for defining and manipulating data, which is very powerful. In NoSQL databases, queries are focused on collections of documents. This is sometimes called UnQL (Unstructured Query Language), and the query syntax varies from database to database.
  • SQL database examples: MySQL, Oracle, SQLite, Postgres and MS-SQL. NoSQL database examples: MongoDB, BigTable, Redis, RavenDB, Cassandra, HBase, Neo4j and CouchDB.
  • For complex queries: SQL databases are a good fit for complex, query-intensive environments, whereas NoSQL databases are not a good fit for complex queries. At a high level, NoSQL databases don’t have standard interfaces for complex queries, and their query languages are not as powerful as SQL.
  • For the type of data to be stored: SQL databases are not the best fit for hierarchical data storage. NoSQL databases fit hierarchical data better because they store data as key-value pairs, similar to JSON. NoSQL databases are highly preferred for large data sets (i.e. big data). HBase is an example.
  • For scalability: In most typical situations, SQL databases are scaled vertically. You manage increasing load by adding CPU, RAM, SSD, etc. to a single server. NoSQL databases, on the other hand, are scaled horizontally. You can add a few more servers to your NoSQL infrastructure to handle heavier traffic.
  • For highly transactional applications: SQL databases are the best fit for heavy-duty transactional applications, as they are more stable and promise atomicity as well as data integrity. While you can use NoSQL for transactions, it is still not comparable or stable enough under high load and for complex transactional applications.
  • For support: Excellent support is available for all SQL databases from their vendors. There are also a lot of independent consultants who can help you with very large-scale SQL deployments. For some NoSQL databases you still have to rely on community support, and only limited outside expertise is available to set up and deploy large-scale NoSQL installations.
  • For properties: SQL databases emphasize ACID properties (Atomicity, Consistency, Isolation and Durability), whereas NoSQL databases follow Brewer’s CAP theorem (Consistency, Availability and Partition tolerance).
  • For DB types: At a high level, SQL databases can be classified as either open-source or closed-source from commercial vendors. NoSQL databases can be classified by the way they store data: graph databases, key-value stores, document stores, column stores and XML databases.

Popular SQL Databases and RDBMSs

  • MySQL – the most popular open-source database, excellent for CMS sites and blogs.
  • Oracle – an object-relational DBMS written in the C++ language. If you have the budget, this is a full-service option with great customer service and reliability. Oracle has also released an Oracle NoSQL database.
  • IBM DB2 – a family of database server products from IBM that are built to handle advanced “big data” analytics.
  • Sybase – a relational database server product for businesses, primarily used on Unix, which was the first enterprise-level DBMS for Linux.
  • MS SQL Server – a Microsoft-developed RDBMS for enterprise-level databases that supports both SQL and NoSQL architectures.
  • Microsoft Azure – a cloud computing platform that supports any operating system, and lets you store, compute, and scale data in one place. A recent survey even put it ahead of Amazon Web Services and Google Cloud Storage for corporate data storage.
  • MariaDB – an enhanced, drop-in replacement for MySQL.
  • PostgreSQL – an enterprise-level, object-relational DBMS that supports procedural languages like Perl and Python, in addition to SQL-level code.

Popular NoSQL Databases

  • MongoDB – the most popular NoSQL system, especially among startups. A document-oriented database that stores JSON-like documents with dynamic schemas instead of relational tables, used on the back end of sites like Craigslist, eBay and Foursquare. It’s open-source, so it’s free, with good customer service.
  • Apache’s CouchDB – a true DB for the web, it uses the JSON data exchange format to store its documents; JavaScript for indexing, combining and transforming documents; and HTTP for its API.
  • HBase – another Apache project, developed as a part of Hadoop, this open-source, non-relational “column store” NoSQL DB is written in Java, and provides BigTable-like capabilities.
  • Oracle NoSQL – Oracle’s entry into the NoSQL category.
  • Apache’s Cassandra DB – born at Facebook, Cassandra is a distributed database that’s great at handling massive amounts of structured data. Anticipate a growing application? Cassandra is excellent at scaling out. Examples: Instagram, Comcast, Apple, and Spotify.
  • Riak – an open-source key-value store database written in Erlang. It has fault-tolerant replication and automatic data distribution built in for excellent performance.
  • Redis – an open source (BSD licensed), in-memory data structure store, used as a database, cache and message broker. It supports data structures such as strings, hashes, lists, sets, sorted sets with range queries, bitmaps, hyperloglogs and geospatial indexes with radius queries. Redis has built-in replication, Lua scripting, LRU eviction, transactions and different levels of on-disk persistence, and provides high availability via Redis Sentinel and automatic partitioning with Redis Cluster.

Which database solution is right for you?


 

 

 
