
This is part three of a five-part mini-series looking at Application Data Value Characteristics (Everything Is Not the Same), a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application and data characteristics with a focus on different types of data. There is more to data than simply being big, fast, big and fast, structured, unstructured, or semi-structured, some of which has been touched on in this series, with more to follow. Note that data also includes programs, applications, code, rules, policies, and configuration settings, along with metadata and other stored items.

 

Application Data Value Software Defined Data Infrastructure Essentials Book SDDC

 

Various Types of Data

Data types and characteristics include big data, little data, fast data, and old as well as new data, each with different value, life cycle, volume, and velocity. There are large files and objects representing images, figures, text, and binary content, structured or unstructured, that are software defined by the applications that create, modify, and use them.

 

There are many different types of data and applications to meet various business, organizational, or functional needs. Keep in mind that applications are based on programs, which consist of algorithms and data structures that define the data, how to use it, and how and when to store it. Those data structures define the data that programs transform into information while it is held in memory and stored in various formats.

 

Just as various applications have different algorithms, they also have different types of data. Even though everything is not the same across environments, or even in how the same applications get used across organizations, there are some similarities and general characteristics. Keep in mind that information is the result of programs (applications and their algorithms) processing data into something useful or of value.

 

Data  typically has a basic life cycle of:

  • Creation and some activity, including being protected
  • Dormant, followed by either  continued activity or going inactive
  • Disposition (delete or remove)

 

In general, data can be:

  • Temporary, ephemeral or transient
  • Dynamic or changing (“hot data”)
  • Active static on-line, near-line,  or off-line (“warm-data”)
  • In-active static on-line or  off-line (“cold data”)
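To make the hot, warm, and cold distinction above more concrete, here is a minimal sketch (not from the book) that classifies data by how recently it was last accessed; the age thresholds are arbitrary assumptions for illustration and would be tuned per environment and data value.

```python
from datetime import datetime, timedelta
from typing import Optional

# Hypothetical age thresholds; real policies vary by environment and data value.
HOT_AGE = timedelta(days=7)      # dynamic or changing ("hot") data
WARM_AGE = timedelta(days=90)    # active static ("warm") data

def classify_temperature(last_accessed: datetime, now: Optional[datetime] = None) -> str:
    """Classify data as hot, warm, or cold based on how recently it was accessed."""
    now = now or datetime.utcnow()
    age = now - last_accessed
    if age <= HOT_AGE:
        return "hot"       # dynamic or changing
    if age <= WARM_AGE:
        return "warm"      # active static, on-line or near-line
    return "cold"          # in-active static, on-line or off-line

# Example: data last touched six months ago lands in the cold tier.
print(classify_temperature(datetime.utcnow() - timedelta(days=180)))  # -> "cold"
```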

 

Data is organized as:

  • Structured
  • Semi-structured
  • Unstructured

 

General  data characteristics include:

  • Value = From no value to unknown  to some or high value
  • Volume = Amount of data, files,  objects of a given size
  • Variety = Various types of data (small, big, fast, structured, unstructured)
  • Velocity = Data streams, flows,  rates, load, process, access, active or static

 

The  following figure shows how different data has various values over time. Data  that has no value today or in the future can be  deleted, while data with unknown value can be retained.

Different  data with various values over time

Application Data Value across sddc
Data Value Known, Unknown and No Value

 

General characteristics include the value of the data, which in turn determines its performance, availability, capacity, and economic considerations. Also, data can be ephemeral (temporary) or kept for longer periods of time on persistent, non-volatile storage (you do not lose the data when power is turned off). Examples of temporary data include work and scratch areas, such as where data gets imported into, or exported out of, an application or database.

 

Data  can also be little, big, or big and fast, terms which describe in part the size  as well as volume along with the speed or velocity of being created, accessed,  and processed. The importance of understanding characteristics of data and how  their associated applications use them is to enable effective decision-making about performance, availability, capacity, and economics of data infrastructure  resources.

Data Value

There is more to data storage than how much space capacity you get per unit of cost.

 

All data has one  of three basic values:

  • No value = ephemeral/temp/scratch  = Why keep it?
  • Some value = current or emerging  future value, which can be low or high =  Keep
  • Unknown value = protect until  value is unlocked, or no remaining value

 

In addition to the above basic three, data with some value can be further subdivided into little value, some value, or high value. Of course, you can keep subdividing into as many different categories as needed; after all, everything is not always the same across environments.

 

Besides having some value, data can also increase or decrease in value over time, or even go from unknown to known value, from known to unknown, or to no value. Data with no value can be discarded; if in doubt, make and keep a copy of that data somewhere safe until its value (or lack of value) is fully known and understood.

 

The importance of understanding the value of data is to enable effective decision-making on where and how to protect, preserve, and cost-effectively store the data. Note that cost-effective does not necessarily mean the cheapest or lowest-cost approach; rather, it means the approach that aligns with the value and importance of the data at a given point in time.
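As a simple illustration of mapping data value to protection and placement decisions, here is a hedged sketch; the categories mirror the ones above, but the copy counts, tiers, and actions are illustrative assumptions rather than recommendations.

```python
# Hypothetical mapping of data value categories to protection and placement decisions.
# The categories mirror the text above; copy counts, tiers, and actions are illustrative.
VALUE_POLICY = {
    "no_value":      {"action": "delete or expire",                        "copies": 0, "tier": "none"},
    "unknown_value": {"action": "retain and protect until value is known", "copies": 2, "tier": "capacity"},
    "some_value":    {"action": "protect on the normal schedule",          "copies": 3, "tier": "capacity"},
    "high_value":    {"action": "protect aggressively (4 3 2 1)",          "copies": 4, "tier": "performance"},
}

def placement_for(value_category: str) -> dict:
    """Return the illustrative protection/placement policy; default to 'unknown' when unsure."""
    return VALUE_POLICY.get(value_category, VALUE_POLICY["unknown_value"])

print(placement_for("unknown_value")["action"])  # retain and protect until value is known
```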

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software-defined data center (SDDC), software-defined data infrastructures (SDDI)  and related topics via the following links:

 

https://storageioblog.com/data-infrastructure-primer-overview/

SDDC Data Infrastructure

 

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

 

Software Defined Data Infrastructure Essentials Book SDDC

What this all means and wrap-up

Data has different value at various times, and that value is also evolving. Everything is not the same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. Continue reading the next post (Part IV Application Data Volume Velocity Variety Everything Not The Same) in this series here.

 

Ok, nuff said, for now.

Gs

This is part two of a five-part mini-series looking at Application Data Value Characteristics (Everything Is Not the Same), a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we continue looking at application performance, availability, capacity, and economic (PACE) attributes that have an impact on data value as well as availability.

 

4 3 2 1 data protection  Book SDDC

Availability (Accessibility, Durability, Consistency)

Just as there are many different aspects and focus areas for performance, there are also several facets to availability. Note that application performance requires availability, and availability relies on some level of performance.

 

Availability is a broad and encompassing area that includes data protection to protect, preserve, and serve (backup/restore, archive, BC, BR, DR, HA) data and applications. There are logical and physical aspects of availability, including data protection as well as security, including key management (manage your keys, authentication, and certificates) and permissions, among other things.

 

Availability = accessibility (can you get to your application and data) + durability (is the data intact and consistent). This includes basic Reliability, Availability, Serviceability (RAS), as well as high availability, accessibility, and durability. "Durable" has multiple meanings, so context is important. In one context, durable means how data infrastructure resources hold up to, survive, and tolerate wear and tear from use (i.e., endurance), for example, flash SSDs or mechanical devices such as Hard Disk Drives (HDDs). In another context, durable refers to data, meaning how many copies exist in various places.

 

Server,  storage, and I/O network availability topics include:

  • Resiliency and self-healing to tolerate  failure or disruption
  • Hardware, software, and services  configured for resiliency
  • Accessibility to reach or be reached for handling work
  • Durability and consistency of  data to be available for access
  • Protection of data, applications, and assets including security

 

Additional server  I/O and data infrastructure along with storage topics include:

  • Backup/restore, replication,  snapshots, sync, and copies
  • Basic Reliability, Availability, Serviceability, HA, fail over, BC,  BR, and DR
  • Alternative paths, redundant components, and associated software
  • Applications that are fault-tolerant,  resilient, and self-healing
  • Non disruptive upgrades, code (application  or software) loads, and activation
  • Immediate data consistency and  integrity vs. eventual consistency
  • Virus, malware, and other data corruption or loss prevention

 

From a data protection standpoint, the fundamental rule or guideline is 4 3 2 1, which means having at least four copies consisting of at least three versions (different points in time), at least two of which are on different systems or storage devices, and at least one of those being off-site (on-line, off-line, cloud, or other). There are many variations of the 4 3 2 1 rule, shown in the following figure, along with approaches on how to manage the technology to use. We will go deeper into this subject in later chapters. For now, remember the following.

 

4 3 2 1 data protection (via Software Defined Data Infrastructure  Essentials)

 

1.  At least four copies of data (or more): enables durability in case a copy goes bad, is deleted or corrupted, or a device or site fails.
2.  Three (or more) versions of the data to retain: enables various recovery points in time to restore, resume, or restart from.
3.  Data located on two or more systems (devices or media): enables protection against device, system, server, file system, or other fault/failure.
4.  At least one of those copies off-premises and not live (isolated from the active primary copy): enables resiliency across sites, as well as a space, time, and distance gap for protection.
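To illustrate the guideline, the following is a minimal sketch (Python 3.9+, not any real tool's API) that checks a set of protection copies against the 4 3 2 1 rule; the Copy model and its field names are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class Copy:
    """One protection copy of a data set (illustrative model, not any tool's API)."""
    version: str    # point in time, e.g. "2017-10-01"
    system: str     # device, system, or media it lives on
    offsite: bool   # isolated from the active primary copy?

def meets_4321(copies: list[Copy]) -> bool:
    """Check a set of copies against the 4 3 2 1 guideline described above."""
    enough_copies   = len(copies) >= 4                          # at least four copies
    enough_versions = len({c.version for c in copies}) >= 3     # at least three versions
    enough_systems  = len({c.system for c in copies}) >= 2      # on two or more systems
    has_offsite     = any(c.offsite for c in copies)            # at least one off-site
    return enough_copies and enough_versions and enough_systems and has_offsite

copies = [
    Copy("v1", "prod-array", False),
    Copy("v2", "backup-appliance", False),
    Copy("v3", "backup-appliance", False),
    Copy("v3", "cloud-bucket", True),
]
print(meets_4321(copies))  # True: 4 copies, 3 versions, 2+ systems, 1 off-site
```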

Capacity and Space (What Gets Consumed and Occupied)

In addition to being available and accessible in a timely manner (performance), data (and applications) occupy space. That space includes memory in servers, as well as consumable processor (CPU) time along with I/O (performance), including over networks.

 

Data  and applications also consume storage space where they are stored. In addition to basic data space, there is also space  consumed for metadata as well as protection copies (and overhead), application  settings, logs, and other items. Another aspect of capacity includes network IP  ports and addresses, software licenses, server, storage, and network bandwidth  or service time.

 

Server,  storage, and I/O network capacity topics include:

  • Consumable time-expiring  resources (processor time, I/O, network bandwidth)
  • Network IP and other addresses
  • Physical resources of servers,  storage, and I/O networking devices
  • Software licenses based on  consumption or number of users
  • Primary and protection copies of  data and applications
  • Active and standby data infrastructure  resources and sites
  • Data  footprint reduction (DFR) tools and techniques for space optimization
  • Policies, quotas, thresholds,  limits, and capacity QoS
  • Application and database  optimization

 

DFR includes various techniques,  technologies, and tools to reduce the impact or overhead of protecting, preserving,  and serving more data for longer periods of time. There are many different  approaches to implementing a DFR strategy,  since there are various applications and data.

 

Common DFR techniques and technologies include archiving, backup modernization, copy data management (CDM), cleanup, compression, consolidation, data management, deletion and dedupe, storage tiering, RAID (including parity-based, erasure codes, local reconstruction codes [LRC], Reed-Solomon, and Ceph Shingled Erasure Code [SHEC], among others), protection configurations, and thin provisioning, among others.

 

DFR can be implemented in various complementary locations, from row-level compression in databases or email, to normalized databases, to file systems, operating systems, appliances, and storage systems, using various techniques.

 

Also, keep in mind that not all data is the same; some is sparse, some is dense, and some can be compressed or deduped while other data cannot. However, identical copies can be identified, with links created to a common copy.
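As a rough illustration of identifying identical copies and linking them to a common (optionally compressed) stored copy, here is a minimal content-hash sketch; it is a toy model of dedupe, not how any particular storage system implements DFR.

```python
import hashlib
import zlib

def dedupe_and_compress(objects: dict[str, bytes]) -> tuple[dict[str, str], dict[str, bytes]]:
    """Toy dedupe: identical objects are detected by content hash and all point
    (link) to a single stored, compressed copy."""
    links: dict[str, str] = {}    # object name -> content fingerprint
    store: dict[str, bytes] = {}  # fingerprint -> compressed data, stored once
    for name, data in objects.items():
        fingerprint = hashlib.sha256(data).hexdigest()
        if fingerprint not in store:
            store[fingerprint] = zlib.compress(data)
        links[name] = fingerprint
    return links, store

# Two identical reports are stored once; dense or already-compressed data would gain little.
links, store = dedupe_and_compress({
    "report_a.txt": b"quarterly results " * 100,
    "report_b.txt": b"quarterly results " * 100,
})
print(len(links), len(store))  # 2 names, 1 unique compressed copy
```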

Economics (People, Budgets, Energy and other Constraints)

If one constant in life and technology is change, then the other constant is concern about economics or cost. There is a cost to enable and maintain a data infrastructure, on premises or in the cloud, which exists to protect, preserve, and serve data, information, and applications.

 

However, there should also be a benefit to having the data infrastructure to house data and support applications that provide information to users of the services. A common economic focus is what something costs, either as an up-front capital expenditure (CapEx) or as an operating expenditure (OpEx), along with recurring fees.

 

In general, economic considerations  include:

  • Budgets (CapEx and  OpEx), both up front and in recurring fees
  • Whether you buy,  lease, rent, subscribe, or use free and open sources
  • People time needed to integrate  and support even free open-source software
  • Costs including hardware,  software, services, power, cooling, facilities, tools
  • People time includes  base salary, benefits, training and education
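As a simple way to think about the considerations above together, here is an illustrative back-of-the-envelope cost sketch; the straight-line amortization and the example figures are assumptions, not a standard costing method.

```python
def annual_cost(capex: float, amortize_years: int, opex_per_year: float,
                people_hours_per_year: float, loaded_hourly_rate: float) -> float:
    """Illustrative yearly cost: straight-line amortized CapEx + recurring OpEx + people time."""
    return capex / amortize_years + opex_per_year + people_hours_per_year * loaded_hourly_rate

# Example figures (made up): $120k of hardware over 4 years, $30k/yr of services and power,
# plus 200 hours of staff time at a loaded rate of $80/hr.
print(annual_cost(120_000, 4, 30_000, 200, 80))  # 76000.0
```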

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI)  and related topics via the following links:

 

https://storageioblog.com/data-infrastructure-primer-overview/

SDDC Data Infrastructure

 

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

 

https://storageio.com/book4.html

Software Defined Data Infrastructure Essentials Book SDDC

What this all means and wrap-up

Keep in mind that with application data value characteristics, everything is not the same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. All applications have some element of performance, availability, capacity, and economic (PACE) needs as well as resource demands. Data storage discussions often focus on storage efficiency and utilization, which is where data footprint reduction (DFR) techniques, tools, trends, and technologies address capacity requirements. However, with data storage there is also an expanding focus on storage effectiveness, also known as productivity, tied to performance, along with availability including 4 3 2 1 data protection. Continue reading the next post (Part III Application Data Characteristics Types Everything Is Not The Same) in this series here.

 

Ok, nuff said, for now.

 

Gs

Everything Is Not The Same Application Data Value Characteristics

 

This is part one of a five-part mini-series looking at Application Data Value Characteristics (Everything Is Not the Same), a companion excerpt from chapter 2 of my new book Software Defined Data Infrastructure Essentials – Cloud, Converged and Virtual Fundamental Server Storage I/O Tradecraft (CRC Press 2017), available at Amazon.com and other global venues. In this post, we start things off by looking at general application server storage I/O characteristics that have an impact on data value as well as access.

 

Application Data Value Software Defined Data Infrastructure Essentials Book SDDC

 

Everything is not the same across different organizations, including Information Technology (IT) data centers and data infrastructures, along with the applications as well as the data they support. For example, there is so-called big data, which can be many small files, objects, blobs, or data and bit streams representing telemetry, click-stream analytics, and logs, among other information.

Keep in mind that applications impact how data is accessed, used, processed, moved, and stored. What this means is that a focus on data value, access patterns, and other related topics also needs to consider application performance, availability, capacity, and economic (PACE) attributes.

 

If everything is not the same, why is so much data along with many applications treated the same from a PACE perspective?

 

Data infrastructure resources, including servers, storage, and networks, might be cheap or inexpensive; however, there is a cost to managing them along with the data.

 

Managing includes data protection (backup, restore, BC, DR, HA, security) along with other activities. Likewise, there is a cost to the software along with cloud services among others. By understanding how applications use and interact with data, smarter, more informed data management decisions can be made.

 

IT Applications and Data Infrastructure Layers
IT Applications and Data Infrastructure Layers

 

Keep in mind that everything is not the same across various organizations, data centers, data infrastructures, data, and the applications that use them. Also keep in mind that programs (e.g., applications) = algorithms (code) + data structures (how data is defined and organized, structured or unstructured).

 

There  are traditional applications, along with those tied to Internet of Things  (IoT), Artificial Intelligence (AI) and Machine Learning (ML), Big Data and  other analytics including real-time click stream, media and entertainment,  security and surveillance, log and telemetry processing among many others.

 

What this means is that there are many different applications with various characteristics and attributes, along with resource (server compute, I/O, network, memory, and storage) and service requirements.

 

Common Applications Characteristics

Different applications have various attributes as well as different ways they are used, for example, database transaction activity vs. reporting or analytics, logs and journals vs. redo logs, indices, tables, import/export, and scratch and temp space. Performance, availability, capacity, and economics (PACE) describe the application and data characteristics and needs shown in the following figure.

 

Application and data PACE attributes
Application PACE attributes (via Software Defined Data Infrastructure  Essentials)

 

All applications have PACE attributes, however:

  • PACE attributes vary by  application and usage
  • Some applications and their data  are more active than others
  • PACE characteristics may vary within different parts of an application

 

Think of applications along with associated data PACE as its  personality or how it behaves, what it does, how it does it, and when, along  with value, benefit, or cost as well as quality-of-service (QoS) attributes.

 

Understanding applications in different environments, including data values and associated PACE attributes, is essential for making informed server, storage, I/O, and data infrastructure decisions. Data infrastructure decisions range from configuration to acquisitions or upgrades; when, where, why, and how to protect; and how to optimize performance, including capacity planning, reporting, and troubleshooting, not to mention addressing budget concerns.

 

Primary PACE attributes for active and inactive applications and data are:

P - Performance  and activity (how things get used)
A - Availability and durability (resiliency and data protection)
C - Capacity and space (what things use or occupy)
E - Economics  and Energy (people, budgets, and other  barriers)
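To make the PACE idea a bit more tangible, here is a small illustrative sketch that captures an application's PACE profile as a data structure; the field choices and example values are assumptions for illustration, not definitions from the book.

```python
from dataclasses import dataclass

@dataclass
class PACE:
    """Illustrative PACE profile for an application; fields and values are assumptions."""
    performance: str   # e.g. "high IOPS, low latency" or "batch throughput"
    availability: str  # e.g. "RPO 0 / RTO minutes" or "best effort"
    capacity: str      # e.g. "10 TB growing 20% per year"
    economics: str     # e.g. "minimize cost per TB" or "maximize work per dollar"

oltp_database = PACE(
    performance="high IOPS, low latency reads and writes",
    availability="RPO 0, HA across sites",
    capacity="2 TB with modest growth",
    economics="cost per transaction",
)
backup_repository = PACE(
    performance="streaming/bulk throughput",
    availability="durable copies, slower restore acceptable",
    capacity="100 TB and growing",
    economics="lowest cost per TB retained",
)
```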

 

Some applications need more performance (server compute, or storage and network I/O), while others need space capacity (storage, memory, network, or I/O connectivity). Likewise, some applications have different availability needs (data protection, durability, security, resiliency, backup, business continuity, disaster recovery) that determine the tools, technologies, and techniques to use.

 

Budgets are also nearly always a concern, which for some applications means enabling more performance per cost while others are focused on maximizing space capacity and protection level per cost. PACE attributes also  define or influence policies for QoS (performance, availability, capacity), as well as thresholds, limits, quotas,  retention, and disposition, among others.

 

Performance and Activity (How Resources Get Used)

Some applications or components that comprise a larger solution will have more performance demands than others. Likewise,  the performance characteristics of applications along with their associated data will also vary. Performance applies to the server,  storage, and I/O networking hardware along with associated software and applications.

 

For servers, performance is focused on how much CPU  or processor time is used, along with memory and I/O operations. I/O operations to create, read, update, or delete  (CRUD) data include activity rate (frequency or data velocity) of I/O operations  (IOPS). Other considerations include the volume or amount of data being moved (bandwidth, throughput,  transfer), response time or latency, along with queue depths.

 

Activity is the amount of work to do or being done in a given amount of time (seconds, minutes, hours, days, weeks), which can be transactions, rates, or IOPS. Additional performance considerations include latency, bandwidth, throughput, response time, queues, reads or writes, gets or puts, updates, lists, directories, searches, page views, files opened, videos viewed, or downloads.
  
  Server,  storage, and I/O network performance include:

  • Processor CPU usage time and  queues (user and system overhead)
  • Memory usage effectiveness  including page and swap
  • I/O activity including between  servers and storage
  • Errors, retransmission, retries, and rebuilds

 

The following figure shows a generic performance example of data being accessed (mixed reads, writes, random, sequential, big, small, low and high latency) on a local and a remote basis. The example shows how, for a given time interval (see lower right), applications are accessing and working with data via different data streams shown in the larger image at left center. Also shown are queues and I/O handling along with end-to-end (E2E) response time.

 

fundamental server storage I/O
Server I/O performance  fundamentals (via Software Defined  Data Infrastructure Essentials)

 


 

Also shown on the left in the above figure is an example of  E2E response time from the application through the various data infrastructure  layers, as well as, lower center, the response time from the server to the memory  or storage devices.

 

Various queues are shown in the middle of the above figure; these are indicators of how much work is occurring and whether the processing is keeping up with the work or causing backlogs. Context is needed for queues, as they exist in the server, I/O networking devices, and software drivers, as well as in storage, among other locations.

 

Some  basic server, storage, I/O metrics that matter include:

  • Queue depth of I/Os waiting to be processed and concurrency
  • CPU and memory usage to process  I/Os
  • I/O size, or how much data can be moved in a given operation
  • I/O activity rate or IOPs =  amount of data moved/I/O size per unit of time
  • Bandwidth = data moved per unit  of time = I/O size × I/O rate
  • Latency usually increases with  larger I/O sizes, decreases with smaller requests
  • I/O rates usually increase with  smaller I/O sizes and vice versa
  • Bandwidth increases with larger  I/O sizes and vice versa
  • Sequential stream access data  may have better performance than some random access data
  • Not all data is conducive to  being sequential stream, or random
  • Lower response  time is better, higher activity rates and bandwidth are better
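As a quick worked example of the I/O rate, I/O size, and bandwidth relationship listed above, here is a small sketch; the IOPS and I/O size figures are made up for illustration.

```python
def bandwidth_mb_per_sec(iops: float, io_size_kb: float) -> float:
    """Bandwidth = I/O rate x I/O size (the relationship listed above)."""
    return iops * io_size_kb / 1024.0

# Small I/Os: a high activity rate that still moves a modest amount of data.
print(bandwidth_mb_per_sec(iops=20_000, io_size_kb=4))   # 78.125 MB/s
# Large I/Os: a much lower rate can move far more data per second.
print(bandwidth_mb_per_sec(iops=2_000, io_size_kb=256))  # 500.0 MB/s
```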

 

Queues  with high latency and small I/O size or small I/O rates could indicate a  performance bottleneck. Queues with low latency and high I/O rates with good bandwidth  or data being moved could be a good  thing. An important note is to look at several metrics, not just IOPs or  activity, or bandwidth, queues, or response time. Also, keep in mind that metrics that matter for your environment  may be different from those for somebody else.

 

Something to keep in perspective is that there can be a large amount  of data with low performance, or a small  amount of data with high-performance, not to mention many other variations. The  important concept is that as space capacity scales, that does not mean  performance also improves or vice versa, after all, everything is not the same.

Where to learn more

Learn more about Application Data Value, application characteristics, PACE along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI)  and related topics via the following links:

https://storageioblog.com/data-infrastructure-primer-overview/

 

SDDC Data Infrastructure

 

Additional learning experiences along with common questions (and answers), as well as tips can be found in Software Defined Data Infrastructure Essentials book.

 

Software Defined Data Infrastructure Essentials Book SDDC

What this all means and wrap-up

Keep in mind that with application data value characteristics, everything is not the same across various organizations, data centers, and data infrastructures spanning legacy, cloud and other software defined data center (SDDC) environments. However, all applications have some element (high or low) of performance, availability, capacity, and economic (PACE) attributes, along with various similarities. Likewise, data has different value at various times. Continue reading the next post (Part II Application Data Availability Everything Is Not The Same) in this five-part mini-series here.

 

Ok, nuff said, for now.

Gs

VMware continues cloud construction with March announcements

VMware continues cloud construction sddc

 

VMware continues cloud  construction with March announcements of new features and other enhancements.

 

VMware continues cloud construction SDDC data infrastructure strategy big picture
VMware Cloud Provides Consistent Operations and Infrastructure Via: VMware.com

 

With its recent announcements, VMware continues cloud construction adding new features, enhancements, partnerships along with services.

 

As VMware continues cloud construction, like other vendors and service providers who tried and tested the waters of having their own public cloud, it has moved beyond its vCloud Air initiative, selling that business to OVH. VMware, while a publicly traded company (VMW), is by way of majority ownership part of the Dell Technologies family of companies via the 2016 acquisition of EMC by Dell. What this means is that, like Dell Technologies, VMware is focused on providing solutions and services to its cloud provider partners instead of building, deploying, and running its own cloud in competition with those partners.

 

VMware continues cloud construction SDDC data infrastructure strategy layers
VMware Cloud Data Infrastructure and SDDC layers Via: VMware.com

 

The VMware Cloud message and strategy is focused on providing software solutions to cloud and other data infrastructure partners (and customers) instead of competing with them (e.g., divesting vCloud Air, partnering with AWS and IBM Softlayer). Part of that message and strategy is to provide consistent operations and management across clouds, containers, and virtual machines (VMs), as well as other software defined data center (SDDC) and software defined data infrastructure environments.

 

In other words, VMware provides consistent management that leverages the common experiences of data infrastructure staff along with resources in hybrid, cross-cloud, and software defined environments, in support of existing as well as cloud native applications.

 

VMware continues cloud construction on AWS SDDC
VMware Cloud on AWS Image via: AWS.com

 

Note that VMware Cloud services run on top of AWS EC2 bare metal (BM) server instances, as well as on BM instances at IBM Softlayer and OVH. Learn more about AWS EC2 BM compute instances aka Metal as a Service (MaaS) here. In addition to AWS, IBM, and OVH, VMware claims over 4,000 regional cloud and managed service providers who have built out their data infrastructures using VMware based technologies.

 

VMware continues cloud construction updates

Building off previous announcements, VMware continues cloud construction with enhancements to its Amazon Web Services (AWS) partnership along with services for the IBM Softlayer cloud as well as OVH. As a refresher, OVH is what was formerly known as VMware vCloud Air before it was sold off.

 

Besides expanding on existing cloud partner solution offerings, VMware also announced additional cloud, software defined data center (SDDC), and other software defined data infrastructure environment management capabilities. SDDC and data infrastructure management tools include those leveraging VMware's acquisition of Wavefront, among others.

 

VMware Cloud Updates and New Features

  • VMware Cloud on AWS European regions (now in London, adding Frankfurt, Germany)
  • Stretched Clusters with synchronous replication for cross-geography location resiliency
  • Support for data-intensive workloads including data footprint reduction (DFR) with vSAN based compression and data deduplication
  • Fujitsu services offering relationships
  • Expanded VMware Cloud Services enhancements

 

VMware Cloud Services enhancements include:

  • Hybrid Cloud Extension
  • Log intelligence
  • Cost insight
  • Wavefront

VMware Cloud in additional AWS Regions

As part of the service expansion, VMware Cloud on AWS has been extended into a European region (London), with plans to expand into Frankfurt and an Asia-Pacific location. Previously, VMware Cloud on AWS was available in the US West (Oregon) and US East (Northern Virginia) regions. Learn more about AWS Regions and availability zones (AZ) here.

 

VMware Cloud Stretch Cluster
VMware Cloud on AWS Stretch Clusters Source: VMware.com

 

VMware Cloud on AWS Stretch Clusters

In addition to expanding into additional regions, VMware Cloud on AWS is also being extended with stretched clusters for geographically dispersed protection. Stretched clusters provide protection against an AZ failure (e.g., a data center site) for mission-critical applications. Built on vSphere HA and DRS automated host-failure technology, stretched clusters provide a recovery point objective of zero (RPO 0) for continuous protection and high availability across AZs at the data infrastructure layer.

 

The benefit of data infrastructure layer based HA and resiliency is not having to re-architect or modify upper-level, higher-layered applications or software. Synchronous replication between AZs enables RPO 0; if one AZ goes down, it is treated as a vSphere HA event, with VMs restarted in another AZ.

 

vSAN based Data Footprint Reduction (DFR) aka Compression and Deduplication

To support applications that leverage large amounts of data (aka data-intensive applications in marketing speak), VMware is leveraging vSAN based data footprint reduction (DFR) techniques including compression as well as deduplication (dedupe). Leveraging DFR technologies like compression and dedupe integrated into vSAN, VMware Cloud has the ability to store more data in a given cubic density. Storing more data in a given cubic density improves storage efficiency (e.g., space-saving utilization) and, along with performance acceleration, also facilitates storage effectiveness and productivity.

 

With VMware vSAN technology as one of the core underlying technologies for enabling VMware Cloud on AWS (among other deployments), applications with large data needs can store more data at a lower cost point. Note that VMware Cloud can support 10 clusters per SDDC deployment, with each cluster having 32 nodes and cluster-wide, cluster-aware dedupe. Also note that for performance, VMware Cloud on AWS leverages NVMe-attached Solid State Devices (SSDs) to boost effectiveness and productivity.

 

VMware Hybrid Cloud Extension
Extending VMware vSphere any to any migration across clouds  Source: VMware.com

 

VMware Hybrid Cloud Extension

VMware Hybrid Cloud Extension enables common management of the underlying data infrastructure as well as software defined environments, including across public, private, and hybrid clouds. Capabilities include warm VM migration across various software defined environments, from local on-premises and private clouds to public clouds.

 

New enhancements leverage previously available technology, now offered as a service for enterprises in addition to service providers, to support data center to data center, cloud-centric AZ to AZ, and region to region migrations. Use cases include small to large bulk migrations of hundreds to thousands of VMs, covering both the scheduling and the actual move. Moves and migrations can span hybrid deployments with a mix of on-premises and various cloud services.

 

VMware Cloud Cost Insight

VMware Cost Insight enables analysis and comparison of cloud costs across public (AWS, Azure) and private VMware clouds to avoid flying blind in and among clouds. VMware Cloud Cost Insight enables awareness of how resources are used, their cost, and their benefit to applications as well as IT budget impacts. It integrates the vSAN sizer tool along with AWS metrics for improved situational awareness, cost modeling, analysis, and what-if comparisons.

 

With integration to Network Insight, VMware Cloud Cost Insight also provides awareness of networking costs in support of migrations. What this means is that using VMware Cloud Cost Insight, you can take the guesswork out of what your expenses will be for public, private on-premises, or hybrid clouds by having deeper insight and awareness into your SDDC environment. Learn more about VMware Cost Insight here.

 

VMware Log Intelligence

Log Intelligence is a new VMware cloud service that provides real-time data infrastructure insight along with application visibility from private and on-premises to public and hybrid clouds. As its name implies, Log Intelligence provides syslog and other log insight, analysis, and intelligence with real-time visibility into VMware as well as AWS among other resources for faster troubleshooting, diagnostics, event correlation, and other data infrastructure management tasks.

 

Log and telemetry input sources for VMware Log Intelligence include data infrastructure resources such as operating systems, servers, system statistics, security, and applications, among other syslog events. For those familiar with VMware Log Insight, this capability is an extension of that familiar experience, expanding it into a cloud-based service.

 

VMware Wavefront SaaS analytics
Wavefront by VMware Source: VMware.com

 

VMware Wavefront

VMware Wavefront enables monitoring of cloud native, high-scale environments with custom metrics and analytics. As a reminder, Wavefront was acquired by VMware to enable deep metrics and analytics for developers, DevOps, data infrastructure operations, and SaaS application developers, among others. Wavefront integrates with VMware vRealize along with enabling monitoring of AWS data infrastructure resources and services. With the ability to ingest, process, and analyze various data feeds, the Wavefront engine enables predictive understanding of mixed application, cloud native data, and data infrastructure platforms, including those that are big data based.

 

Where to learn more

Learn more about VMware, vSphere, vRealize, VMware Cloud, AWS (and other clouds), along with data protection, software defined data center (SDDC), software defined data infrastructures (SDDI)  and related topics via the following links:

SDDC Data Infrastructure

 

Additional  learning experiences along with  common questions (and answers), as well as  tips can be found in  Software Defined Data Infrastructure Essentials book.

 

Software Defined Data Infrastructure Essentials Book SDDC

What this all means and wrap-up

VMware continues cloud construction. For now, it appears that VMware, like Dell Technologies, is content with being a technology provider and partner to large as well as small public, private, and hybrid cloud environments instead of building its own and competing. With this series of announcements, VMware continues cloud construction, enabling its partners and customers on their various software defined data center (SDDC) and related data infrastructure journeys. Overall, this is a good set of enhancements, updates, and new and evolving features for the partners as well as customers who leverage VMware based technologies. Meanwhile, VMware continues cloud construction.

 

Ok, nuff said, for now.

Gs