Gaining insight from the scatter plot

At this stage, in a few simple steps, we have configured a very useful visualization that matches the power of the visualizations shown by Hans Rosling. I'm now going to show you how to get the most out of it:

  1. Experiment with adjusting the Marker by hierarchy slider. Notice how it changes the visualization. Region-Sub-region is shown here:
  2. Change the hierarchy slider to Region-Sub-region-Entity:

It's really interesting to see the clusters of points by continent. Experiment with the Region-Sub-region-Entity hierarchy slider. Does any particular level give you a better view of how the regions compare?

More on aggregation: Notice that when you adjust the hierarchy slider, the number of points shown on the scatter plot changes. Spotfire is dynamically recalculating the aggregation of the values for the x-, y-, and size-by axes (yes, size-by is considered to be an axis!) based on the grouping of the data. The x- and y-axes use the Avg function to calculate the average; the size-by axis uses the Sum function. This makes sense in our case, as we always want to know the average rates of population growth and child mortality across a country, region, or continent, but we also always need to know the total population of that country, region, or continent.
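To make the regrouping concrete, here is a minimal sketch in plain Python rather than in Spotfire itself. It assumes a tiny invented table (the region, sub-region, and country names and all the numbers are placeholders, not values from the dataset) and mimics what the text describes: an average for the two rate columns and a sum for population at each level of the hierarchy.

```python
from collections import defaultdict

# Invented placeholder rows, NOT real data:
# (region, sub_region, entity, child_mortality, pop_growth, population)
ROWS = [
    ("Region A", "A-North", "Country 1", 10.0, 1.0, 100),
    ("Region A", "A-North", "Country 2", 20.0, 2.0, 300),
    ("Region A", "A-South", "Country 3", 30.0, 3.0, 200),
    ("Region B", "B-East", "Country 4", 40.0, 4.0, 400),
]


def aggregate(rows, depth):
    """Group rows by the first `depth` hierarchy columns.

    Each group becomes one marker: the two rate columns are averaged
    (like Avg) and the population column is summed (like Sum).
    """
    groups = defaultdict(list)
    for row in rows:
        groups[row[:depth]].append(row)

    markers = {}
    for key, members in groups.items():
        count = len(members)
        markers[key] = {
            "child_mortality_avg": sum(m[3] for m in members) / count,
            "pop_growth_avg": sum(m[4] for m in members) / count,
            "population_sum": sum(m[5] for m in members),
        }
    return markers


# Moving the slider corresponds to changing `depth`.
for depth, level in [(1, "Region"),
                     (2, "Region-Sub-region"),
                     (3, "Region-Sub-region-Entity")]:
    print(level, "->", len(aggregate(ROWS, depth)), "markers")
```

Note that the average in this sketch is a simple unweighted mean of the rows in each group, so a small country counts as much as a large one; that is an assumption of the sketch worth keeping in mind when comparing regions.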

Note that the developing world looks very different from the other parts of the world. Asia is also interesting to look at and the Americas give a very mixed picture.

Experiment with marking (selecting) various regions of the plot. Labels will be shown on the marked points because we set that option in the Properties dialog. It's not perfect, because the label is fixed to Entity, but it gives you a good idea. As an experiment, you could try using the UniqueConcatenate aggregation method for the label; that way, if you are showing a region or sub-region and you mark a point on the scatter plot, you will see a list of all the countries contained in the point. Right now, you just get the first one, as that is the default aggregation method for categorical columns.
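The difference between the two label aggregations can be illustrated with a short sketch in plain Python. The helper names `first` and `unique_concatenate` are hypothetical stand-ins written for this example, and the sorting and the ", " separator are assumptions about presentation rather than a statement of how Spotfire formats the result.

```python
def first(values):
    """Return only the first value in the group (the default behavior described above)."""
    return values[0]


def unique_concatenate(values, separator=", "):
    """Join the distinct values in the group into one label string.

    Sorting and the separator are assumptions made for this sketch.
    """
    return separator.join(sorted(set(values)))


# Entities that fall inside one marked point (placeholder names).
entities_in_marker = ["Country 2", "Country 1", "Country 2", "Country 3"]

print(first(entities_in_marker))
print(unique_concatenate(entities_in_marker))
```

With the first approach the label names a single country even when the marker represents a whole sub-region; with the second, every country in the marker appears once in the label.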

Recall that the scatter plot is showing four dimensions of data all at once: child mortality, population increase, population size, and country/region/continent. It's possible to visualize a fifth dimension (time) by moving through the Year filter. Open the Year filter from the data panel as before, and try clicking through the various years, one by one. You will see the scatter plot animate and show the time dimension.

You can play around here to explore different views of the data in the visualization:

  • Click through the years in the filter and watch the various countries and regions move along the axes, jostle for position in the world order, and so on.
  • Look out for major changes over short periods of time. Keep an eye on countries such as China and India; they have had enormous population growth and massive reductions in infant mortality.
  • Note the overall trend of the world. Better health care and education (we infer) over time have given rise to an overall decline in infant mortality and a reduction in natural population growth.

Recall that we set the limits of the x- and y-axes in the properties of the visualization so that we could use the filter to animate the visualization. If the axes had been left to auto-scale, they would have readjusted every time the filter was changed, so the same position on the plot would have meant a different value from one year to the next.
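The effect of fixed versus auto-scaled axes can be shown with a small sketch in plain Python. The yearly values are invented placeholders and the function names are hypothetical; the point is only that an auto-scaled range follows whatever data is currently filtered in, while a fixed range is computed once and stays the same for every year.

```python
# Invented placeholder values for one axis, keyed by year (NOT real data).
VALUES_BY_YEAR = {
    1960: [5.0, 20.0, 12.0],
    1990: [2.0, 12.0, 7.0],
    2010: [1.0, 6.0, 3.0],
}


def auto_range(values):
    """Axis range that refits to the currently visible values."""
    return (min(values), max(values))


def fixed_range(values_by_year):
    """Single axis range covering every year, so it never changes."""
    all_values = [v for values in values_by_year.values() for v in values]
    return (min(all_values), max(all_values))


shared = fixed_range(VALUES_BY_YEAR)
for year, values in sorted(VALUES_BY_YEAR.items()):
    print(year, "auto:", auto_range(values), "fixed:", shared)
```

In the auto-scaled case the range shrinks as the values fall, which would hide the very decline the animation is meant to reveal; the fixed range keeps each year on the same scale.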

This would be a good point to save the analysis file, if you haven't done so already.