jlbriggs
Registered:1282229480 Posts: 191
Posted 1470313799
#16
Ok, I can see the point now. I am not sure I agree that it has to be one way or the other, or that one way means one thing, and the other means another thing, necessarily. As far as the markers on the lines go, I could go either way. When there are multiple lines to compare to each other, the markers can be helpful to make clear which points relate to each other, over what point on the axis. In this case, the lines are few and simple enough that they may not be needed, but also don't really hurt anything.

danz
Registered:1348995178 Posts: 182
Posted 1470323024
#17
@jlbriggs
Obviously this is a minor matter.
Time related scales are somehow different than numerical scales because unlike numerical values clearly represented by a point on axis, a time reference (day, week, month, year) has also a length. This is why I consider the ticks on a time scale just fine as they were designed. It is also true that designing the scale as Bella suggested wouldn't hurt either.

sfew
Moderator
Registered:1135986598 Posts: 803
Posted 1470325723
#18
I'd like to clear up some confusion about terms. The scale that is used for a time series is a categorical scale. There are different types of categorical scales. I identify three types of categorical scales in my work: nominal, ordinal, and interval. Technically, there is a fourth type--ratio--which is so similar to an interval scale that I don't usually discriminate between them. An interval scale results from taking a quantitative scale and breaking it into a series of equal intervals. When we break time into equal intervals (years, months, weeks, days, etc.), we produce a categorical scale of the interval variety. When, on the other hand, we use time to measure lengths of time, such as the lengths of airline flights, the scale is quantitative. Any time the scale is used to measure things, it is quantitative.
__________________ Stephen Few

acraft
Registered:1306510245 Posts: 51
Posted 1470330436
#19
Quote:

Originally Posted by danz Time related scales are somehow different than numerical scales because unlike numerical values clearly represented by a point on axis, a time reference (day, week, month, year) has also a length.

While days and weeks are almost always the same length, months and years vary in length. Is the variation in each month's length being represented? Or are the months just being rendered as the same length (this is what it looks like, and is the way most BI software does it)? If the latter, then it indicates the length is irrelevant, so I have to wonder why it's represented at all. Unless the actual length of a time interval is important in understanding the visualization, I'd suggest it's best not to represent it. Personally, I prefer Bella's version. Now, if I were showing weekly values on a chart where I had labels on the months, I'd absolutely use the axis as currently designed with ticks marking the month boundaries, but with correct month lengths.

sfew
Moderator
Registered:1135986598 Posts: 803
Posted 1470331540
#20
While it is true that some intervals of time are not exactly equal, especially months, which vary by a few days, with time series we typically treat them as equal, assuming that for our purposes the variation is insignificant. Regarding the shading of months, in addition to the fact that it suggests that the months represent equal intervals, assuming that the shaded regions are equal in width, I would usually avoid this practice for another reason: it adds a great deal of ink to the display that serves no purpose. If more than the tick marks along the scale are needed to support easy comparisons of values among multiple lines in a line graph, small data points along the lines support this more effectively.
__________________ Stephen Few

acraft
Registered:1306510245 Posts: 51
Posted 1470333779
#21
Quote:

Originally Posted by sfew While it is true that some intervals of time are not exactly equal... with time series we typically treat them as equal, assuming that for our purposes the variation is insignificant.

More or less what I said. Although, as I mentioned, if I am plotting weekly values on a time series, and if I'm labeling the months on the axis, the month lengths should *not* be treated as equal, or the intervals between weekly datapoints will be misrepresented.
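
The point about weekly values under month labels can be sketched in Python. This is only an illustration, assuming a hypothetical axis whose x position is the number of days since the first date shown; the function names and the 2016 date range are made up for the example.

```python
from datetime import date, timedelta

def day_offset(d, origin):
    """x position of a date, measured in days from the axis origin."""
    return (d - origin).days

def month_starts(origin, end):
    """First day of each month that falls within [origin, end]."""
    starts = []
    year, month = origin.year, origin.month
    while True:
        first = date(year, month, 1)
        if first > end:
            break
        if first >= origin:
            starts.append(first)
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return starts

origin = date(2016, 1, 1)   # hypothetical axis start
end = date(2016, 4, 30)     # hypothetical axis end

# Weekly data points are evenly spaced, 7 days apart.
weekly_points = [origin + timedelta(weeks=i) for i in range(18)]
week_x = [day_offset(d, origin) for d in weekly_points]

# Month boundary ticks follow the calendar, so month widths differ.
month_x = [day_offset(d, origin) for d in month_starts(origin, end)]
month_widths = [b - a for a, b in zip(month_x, month_x[1:])]
print(month_x, month_widths)
```

With ticks placed this way, the weekly points keep a constant spacing while January, February, and March occupy 31, 29, and 31 days of axis respectively.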

sfew
Moderator
Registered:1135986598 Posts: 803
Posted 1470335417
#22
Acraft, I clarified the fact that a time series is a categorical scale in response to a statement by danz to the contrary. I understand your concern about month labeling when the intervals are weeks, but this concern doesn't arise if you label each weekly interval as a date range (Jul 24 - Jul 30, Jul 31 - Aug 6, etc.) rather than marking the beginnings and endings of months independently from the weeks.
__________________ Stephen Few
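
A minimal Python sketch of the date-range labeling described above. The helper name `week_range_labels` is hypothetical, and the year 2016 is an assumption chosen so that Jul 24 falls on a Sunday; month abbreviations depend on the active locale.

```python
from datetime import date, timedelta

def week_range_labels(first_day, weeks):
    """Build one 'Mon D - Mon D' label per seven-day interval."""
    labels = []
    for i in range(weeks):
        start = first_day + timedelta(weeks=i)
        end = start + timedelta(days=6)
        # Use .day rather than %d so that days are not zero-padded.
        labels.append(f"{start:%b} {start.day} - {end:%b} {end.day}")
    return labels

labels = week_range_labels(date(2016, 7, 24), 3)
print(labels)
```

Each label names its own interval, so a week that spans two months needs no separate month tick.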

danz
Registered:1348995178 Posts: 182
Posted 1470339915
#23
Stephen,
There is no doubt that a sequence such as Jan, Feb, Mar, ... can be considered a categorical scale, as can Jan, Mar, May, Jul, Sep, Nov.
A date-time value is treated as a number, and the mathematical operations that make sense (addition and difference) are as well defined on a time axis as on a numerical axis. In Excel's default 1900 date system, for instance, 1 corresponds by convention to "1 Jan 1900 0:00:00". The unit (1) is the length of a day, an hour is 1/24, etc.
The way I see it, designing a time scale involves an origin and a contiguous interval divided at a time-related resolution (minutes, hours, days, weeks, months, years). Such a scale makes perfect sense for time evolution graphs where values are aggregated at the level of the resolution interval. The fact that months or years are not perfectly equal as time intervals is not that important.
A time scale represents a contiguous interval just like a numerical scale. The resolution intervals are discrete values as they are on a categorical scale.
Considering its similarities to and differences from both numerical and categorical scales, I think that a contiguous time scale should be considered a different type rather than being made to fit either of those categories. Certainly, from a programming point of view, designing a time scale requires a different approach than designing a numerical or a categorical scale.
Dan
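
The "date-time as a number" idea can be sketched in Python. This is an illustration rather than Excel's own implementation: it assumes the commonly used epoch of 30 Dec 1899, which matches Excel's 1900 date system only for dates from 1 Mar 1900 onward (Excel counts a nonexistent 29 Feb 1900). The function names are hypothetical.

```python
from datetime import datetime, timedelta

# Assumed epoch; valid for serial numbers of 61 and above (1 Mar 1900 onward).
EXCEL_EPOCH = datetime(1899, 12, 30)

def serial_to_datetime(serial):
    """Convert a day-based serial number to a datetime (1 unit = 1 day)."""
    return EXCEL_EPOCH + timedelta(days=serial)

def datetime_to_serial(dt):
    """Convert a datetime to a day-based serial number."""
    return (dt - EXCEL_EPOCH) / timedelta(days=1)

noon = datetime(2020, 1, 1, 12, 0)
print(datetime_to_serial(noon))            # whole days plus 0.5 for noon
print(serial_to_datetime(43831))           # start of the same day
print(serial_to_datetime(43831 + 1 / 24))  # roughly one hour later
```

Because the values are plain numbers, differences and additions behave as they do on a numerical axis: subtracting two serials gives a length in days.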

sfew
Moderator
Registered:1135986598 Posts: 803
Posted 1470341322
#24
Dan, As an interval scale, a time series is already different from other types of categorical scales. An interval scale, however, is not limited to time, for any range of quantitative values can be broken into intervals of equal size and thus treated as an interval scale. For example, people's ages broken into 10-year intervals, such as 10-19, 20-29, 30-39, etc., is an interval scale. What's unique about a time series is the fact that the intervals into which time is usually divided involve a standard set (decades, years, quarters, months, weeks, days, hours, etc.), and some of these standard intervals of time have standard names (e.g., the month names of January, February, etc.) and they relate to one another hierarchically (e.g., years consist of twelve months, quarters consist of three months, and so on). An understanding of these names and relationships is sometimes incorporated into software to make particular time-related operations easy to perform, such as sorting month names in chronological order or drilling up and down through time hierarchically. Otherwise, a time series is just like other categorical scales of the interval variety. Do you think that a time series is different in other ways?
__________________ Stephen Few
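
The age example above can be sketched in Python as a quantitative value mapped to an equal-width interval label. The helper name `age_interval_label` and the sample ages are hypothetical, chosen only for illustration.

```python
from collections import Counter

def age_interval_label(age, width=10):
    """Map a whole-number age to its equal-width interval, e.g. 23 -> '20-29'."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

ages = [12, 19, 23, 27, 31, 35, 38]   # made-up sample data
counts = Counter(age_interval_label(a) for a in ages)

# Print the intervals in scale order, sorted by lower bound.
for label in sorted(counts, key=lambda s: int(s.split("-")[0])):
    print(label, counts[label])
```

The labels behave as categories, but their order and equal width come from the underlying quantitative scale.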

danz
Registered:1348995178 Posts: 182
Posted 1470390420
· Edited
#25
A time series is supposed to show a set of values recorded or aggregated for successive predefined intervals in a contiguous period of time. For this we need to design a proper time scale. A time scale used for a contiguous time interval can be represented at different resolutions for the same interval without altering the sense of the graph, in the same way that a numerical scale can use 0,1,2,3,4,5,6,7,8,9,10 or 0,2,4,6,8,10 without altering the sense of the graph. No other categorical scale has an alternative representation: the category label and the corresponding encoded value have to be clearly aligned. Below are two examples where, due to space constraints, the time scale does not display the labels associated with the encoded values, yet the charts make sense. The graphs show two different ways of encoding time on the axis, one showing interval labels, the other having the labels on ticks.

An interval scale does not always imply equal intervals. The intervals can sometimes be very different; they are defined by the logic of the field data (see below). A time scale resolution can have only small variations, like January vs. February or leap vs. non-leap years, so the intervals can usually be considered equal in graphical representations. (Below are some images quickly found on the web, shown only to illustrate interval lengths; ignore the rest of the design.)

A time scale is usually laid out horizontally, while all the other categorical scales can be laid out either horizontally or vertically. A line chart makes a lot of sense with a time scale, showing the continuous (sometimes interrupted) evolution of certain variables, while a line chart associated with any other categorical scale can be used for frequency polygons only.

Considering the differences already mentioned, namely that a time scale has predefined fixed hierarchies and standard (limited) resolution intervals while the other categorical scales do not share similar constraints, I think there are enough reasons to have a separate chapter for time series and the associated time scale.
Dan

acraft
Registered:1306510245 Posts: 51
Posted 1470420167
#26

Quote:

Originally Posted by sfew I clarified the fact that a time series is a categorical scale in response to a statement by danz to the contrary.

Yes I saw your comment explaining this and I agree with it completely. I too was responding to danz, by suggesting that the lengths of time intervals should probably not be visually represented on an axis* unless they are relevant, in which case they should be drawn accurately. *By this, I mean specifically identifying the starts and ends of each interval through labels or ticks (i.e. Joe's version) vs. evenly distributed labels denoting a categorical scale (i.e. Bella's version).
Quote:

Originally Posted by sfew I understand your concern about month labeling when the intervals are weeks, but this concern doesn't arise if you label each weekly interval as a date range (Jul 24 - Jul 30, Jul 31 - Aug 6, etc.) rather than marking the beginnings and endings of months independently from the weeks.

It wasn't so much a concern as it was an example. Your alternate solution works fine, assuming there is enough space for such labels (if not, I'd still use tooltips to denote date ranges). A better example would be a Gantt chart, where lengths of time intervals (such as months) are certainly relevant, must be visually represented, and most importantly must be drawn accurately. But this sub-thread is starting to get too far off-topic, so I'll stop there.

sfew
Moderator
Registered:1135986598 Posts: 803
Posted 1470585012
· Edited
#27
Dan, Almost everything that you've said about a time series applies to interval scales in general--not just time. With any interval scale the intervals for which values appear in the graph can be smaller than the intervals that are labeled along the scale. For example, with a scale based on people's ages, you could label ten-year intervals but provide values for two-year intervals. I would not recommend doing this ordinarily, but you could. Also, a line graph works well for all interval scales, not just a time series. Your statement that an interval scale can have intervals of unequal lengths is not correct. For example, if the scale consists of age ranges, if the age ranges are not equal in size, by definition that would not be an interval scale; it would be an ordinal scale.
__________________ Stephen Few
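
The equal-size condition can be expressed as a small check in Python. This is a sketch only: the function name `scale_type` and the example bin edges are hypothetical, and the "interval" versus "ordinal" wording follows the terminology used in this thread.

```python
def scale_type(edges):
    """Classify ordered bin edges: equal widths -> 'interval', otherwise 'ordinal'."""
    if len(edges) < 3:
        raise ValueError("need at least two bins (three edges) to compare widths")
    widths = [upper - lower for lower, upper in zip(edges, edges[1:])]
    return "interval" if len(set(widths)) == 1 else "ordinal"

print(scale_type([10, 20, 30, 40]))   # ten-year age ranges
print(scale_type([0, 18, 35, 65]))    # age ranges of unequal size
```

Under this definition the second set of age ranges is still ordered, but it no longer qualifies as an interval scale.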

danz
Registered:1348995178 Posts: 182
Posted 1470832358
#28
Nominal, ordinal and interval scales
Thank you, Stephen, for pointing me to the adapted S.S. Stevens scales of measurement used in your work. Without being aware of Stevens's definition, I considered an interval scale to be just a collection of ordered and connected intervals, not necessarily restricted by the equality condition; I therefore treated it as an ordinal scale only and consequently as not suitable for line graphs. Since Stevens considers an interval scale to have intervals of equal size only, I totally agree that a time scale can be considered a variation of an interval scale and that a line graph makes perfect sense on intervals, as it does in time series. Thank you, again, for clarifying this.

The nominal, ordinal and interval scales defined by Stevens were used in your work to define categorical information as the fundamental notion behind equally spaced graphs such as bar or line graphs. Probably influenced by my education (engineering, databases, programming), I always saw the likes of equal-interval scales, time scales, and also logarithmic interval scales (or other accepted scale transformations like power or root) as numerical scales.

"A time series is a categorical scale"
Categorical information can be any criterion that is equally relevant against aggregated or measured data, including nominal, ordinal, interval (including time interval) but also just numbers (usually considered within a certain precision). For the purpose of this post I would like to focus on time series as a variation of interval scales. The following picture was posted by me on this forum some time ago as a comment on a matter raised by a participant regarding the irregular intervals used in a timeline-related graph. A quick search for time series gives me this definition, which more or less explains his/her concern: "A time series is a sequence of numerical data points in successive order, usually occurring in uniform intervals. In plain English, a time series is simply a sequence of numbers collected at regular intervals over a period of time". While I admit that equal time intervals have a decisive role in statistics, including forecasting, I see no reason to consider the above a wrongly designed line graph. Does the above time information fit into a categorical scale? We can say that the election years (unevenly spread) can fit into an ordinal scale, true, for which we cannot design a line graph. Yet the designed graph looks perfectly valid to me and it has not much in common with the "categorical" sense, never mind the equal intervals. The time axis and the graph are designed just the same for unevenly spread events as they would be for events that take place every decade and would be conventionally displayed in the center. The slopes make sense for both evenly and unevenly spread events. As I admitted before, any information can be used to group and aggregate data. When we do that, we consider the criteria equally relevant, therefore we design equally spaced graphs. While this works fine for bar charts, for line charts the quantitative information contained in time-related sets (displayed on the horizontal axis) requires extra attention.
A time axis designed as above works well for both unevenly spread events and equal connected intervals. In both cases we would have valid line charts. For a line chart slopes make sense, so both the vertical and the horizontal axis have a quantitative sense, including the time scale.
Dan
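
The remark about slopes can be illustrated with a short Python sketch. The years and values below are made-up numbers, not data from the graph discussed, and both function names are hypothetical. The sketch compares the slope of each line segment when time is treated quantitatively (change per year) with the change per step that results when unevenly spaced events are drawn at equal spacing.

```python
def slopes_per_year(years, values):
    """Slope of each segment when the x axis is quantitative time."""
    points = list(zip(years, values))
    return [(v1 - v0) / (y1 - y0)
            for (y0, v0), (y1, v1) in zip(points, points[1:])]

def slopes_per_step(values):
    """Slope of each segment when events are drawn equally spaced."""
    return [v1 - v0 for v0, v1 in zip(values, values[1:])]

years = [2000, 2004, 2005, 2010]    # unevenly spaced events (hypothetical)
values = [40.0, 48.0, 50.0, 45.0]   # hypothetical measurements

print(slopes_per_year(years, values))
print(slopes_per_step(values))
```

In this made-up case the first two segments rise at the same rate per year, yet equal spacing would draw the first one four times as steep as the second.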

jrodriguez
Registered:1465324029 Posts: 18
Posted 1471010522
#29
Thanks for the discussion guys, I'm learning a lot from it. To jlbriggs's earlier point, I use very subtle color banding to guide users down to the specific numbers below if they need specifics. More importantly, I usually stack various monthly metrics like the example below, which helps users go down a specific month and see multiple numbers a little easier. Again, open to feedback, but so far the comments have been pretty positive around this layout.

acraft
Registered:1306510245 Posts: 51
Posted 1471017140
#30
Quote:

Originally Posted by jrodriguez ...I use very subtle color banding to guide users down to the specific numbers below if they need specifics.

I actually think this is an acceptable approach. I don't find the color banding is necessary to guide the user, but I don't personally find it distracting, nor do I get the feeling that it implies that the starts/ends of intervals are noteworthy (as the bordering tick marks did). If it were me though, I'd still remove the color banding and put tick marks centered above each month label.