Tanita 2015

Fitness Goals for 2015

Here I am again, regaining the momentum I lost two years ago when I fell off maintaining a healthier lifestyle. This time, for sure, I will stick to it until I reach the 3rd phase of my target and maintain it. Besides, I will be the loser again if I cannot reach or maintain this.

Since WordPress does not allow script embedding, you need to click through to my body composition dashboard to look at my data interactively.

FoxyReign's Body Composition

I used a moving average to smooth the data because I feel that a linear trend in a time-series model is not appropriate when I have very few data points; maybe after three months of tracking, I will include that.
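For anyone curious how a simple moving average works, here is a minimal sketch in R. The readings and the 3-point window are hypothetical and only for illustration; they are not my actual dashboard data or settings.

```r
# Hypothetical body-fat readings (%), for illustration only
body_fat <- c(28.4, 28.1, 28.6, 27.9, 27.7, 27.8, 27.2)

# Trailing moving average: each point is the mean of the current
# reading and the two readings before it (window of 3, an assumption)
window <- 3
moving_avg <- stats::filter(body_fat, rep(1 / window, window), sides = 1)

# The first (window - 1) values are NA because the window is incomplete
smoothed <- as.numeric(moving_avg)
print(round(smoothed, 2))
```

A wider window gives a smoother line but reacts more slowly to real changes, which matters when there are only a few data points.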

What really concerns me is that I see a statistically significant positive correlation between the increase in my fat mass and my fat-free muscle mass. In due time, this has to change: the relationship should go flat or, better, turn negative. I'll definitely look very muscular if that happens.

Sine Plot of Global Temperature of Earth

The Earth formed about 4.5 billion years ago, and since man evolved only very recently relative to that age, we have very scarce information on what happened in prehistoric times apart from fossils, rocks and other geological remains available for study. Even within the very short span of mankind on this planet so far, the global temperature has been going up and down.

Global Temperatures

I found this image in one of the discussions in the Analytics group on LinkedIn. You can see an irregular but seasonal sine wave in the global temperature of the Earth when it is plotted on a time-series horizontal scale. The memories of the eruption of Mount Pinatubo in Pampanga still linger at the back of my head; I think I was in my first few years of primary school when the volcano erupted, destroying thousands of hectares of field crops, with its sulfuric ash even reaching some provinces in the Visayas.

I am not saying that volcanic eruptions are the main cause of the planet cooling down, but they are worth considering alongside the major contributors to warming: solar activity, weather conditions, man-made pollution and ocean temperature patterns.

As I learned in my Natural Sciences subjects, it is hard to predict when a volcano will erupt. Statistically speaking, if you performed an additive smoothing or even fitted an ARIMA model on the data points in the plot above, the next volcanic eruption that may lead to a sudden decrease in the global temperature may happen soon. Why?

The highs and lows in the plot are a few centuries apart. Due to the pollution that greatly affects the warming of the planet, the span of the cooling process becomes shorter. Mother Earth needs to cool down soon.

Personality Test by Talentoday.com

Psychometrics has always been an amazing field of Psychology, and I have always been inspired to analyze the collected data to understand how human behavior works, in Organizational Psychology at least. Talentoday offers a win-win situation where they can collect data for their research and you, as the test taker, can explore how you behave, especially in your professional career.

Talentoday

The test is scored by standardizing against a norm group, so that z-scores and sten scores are used to interpret the results consistently across test takers. Also, I noticed a pattern in the questions that is common in these kinds of tests, where validity is measured to check how well the indicators predict the outcome. The rest of the data collection and calibration process is available on the website once you have finished the exam. The radar chart above is interactive.
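To give an idea of how a raw score becomes a sten score, here is a small sketch of the textbook conversion: standardize to a z-score, then map it to a 1 to 10 scale with a mean of 5.5 and a standard deviation of 2. I do not know Talentoday's exact procedure, so the function name `to_sten` and the norm mean and standard deviation below are purely hypothetical.

```r
# Convert raw scores to sten scores (standard ten).
# norm_mean and norm_sd describe the norm group; the values used
# further down are made up for illustration.
to_sten <- function(raw, norm_mean, norm_sd) {
    z <- (raw - norm_mean) / norm_sd   # z-score against the norm group
    sten <- round(2 * z + 5.5)         # stens have mean 5.5 and SD 2
    pmin(pmax(sten, 1), 10)            # keep the result within 1 to 10
}

# Hypothetical norm group with mean 50 and standard deviation 10
raw_scores <- c(0, 38, 62, 100)
stens <- to_sten(raw_scores, norm_mean = 50, norm_sd = 10)
print(stens)
```

Scores far from the norm are clipped at 1 or 10, which is why stens are read as an ordinal rating rather than an exact measurement.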

The exam result is divided into 5 different clusters. They are listed below in descending order of my results, where my highest score is 6.5 and the lowest is 4. These clusters are the main personality dimensions, which are proven to be essential in the professional world.

  • Dare
  • Excel
  • Manage
  • Adapt
  • Communicate

Also, the exam prepares a motivation scale: the things that drive you to achieve your goals and the things you may need to work on. Obviously, the one I need to work on is communication. Each of the clusters mentioned above earns an ordinal rating from 1 to 10, giving your score more meaning at a lower level.

Talentoday

Lastly, you will be presented with your talent ID, listing the empowering attitudes that make you unique.

Talentoday

Once done, you can even receive a PDF of the summary and a detailed report based on your answers. After 6 months, you can reassess, as part of the test-retest method, to see if there's any change in your outlook. I have not tried comparing my results with those of my friends or of people who have taken the test within the same organization. If there's an option to compare results within the same industry or job specification, that's going to be interesting. I guess that they're still collecting more data, even if these have not been scientifically gathered because of dependency and deviation from randomness.

Join and take your personality exam. Let’s compare!

qqplot

T-test of Parametric Test of Paired Data of the Null Hypothesis

I was working on some sample data last Friday, testing whether it is really worth spending time on. Someone had requested an analysis that I have revised many times, and one of the frustrations I keep encountering is translating these statistical tests into business language. That is another topic that I need to rant on.

Anyway, two separate sets of data were collected. You could think of these as pre and post tests, in a sense, but the background is simply that the same thing was measured again after two weeks. I will start off by encoding these into R.

# Load ggplot2 package. Install this if necessary:
# install.packages("ggplot2")
library(ggplot2)

# Create a data frame of the paired data: 10 scores per test
test.data <- data.frame(
    Test = as.character(rep(1:2, each = 10)),
    Score = c(0.54, 0.573, 0.575, 0.589, 0.639, 0.624, 0.64, 0.565, 0.694, 0.605,
        0.632, 0.535, 0.556, 0.533, 0.516, 0.575, 0.57, 0.608, 0.58, 0.502))

As usual, I am a fan of the subset function. I could use square brackets, but I am very comfortable using this; it gets the job done.

# Subset
test1 <- subset(test.data, Test == 1)
test2 <- subset(test.data, Test == 2)
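For comparison, here is a sketch of the same subsetting done with square brackets. The names `test1_brackets` and `test2_brackets` are only for illustration; the data frame is rebuilt here so the snippet runs on its own.

```r
# Same paired data as above, rebuilt so this snippet is self-contained
test.data <- data.frame(
    Test = as.character(rep(1:2, each = 10)),
    Score = c(0.54, 0.573, 0.575, 0.589, 0.639, 0.624, 0.64, 0.565, 0.694, 0.605,
        0.632, 0.535, 0.556, 0.533, 0.516, 0.575, 0.57, 0.608, 0.58, 0.502))

# Square-bracket indexing: a logical row condition, then all columns
test1_brackets <- test.data[test.data$Test == "1", ]
test2_brackets <- test.data[test.data$Test == "2", ]

# Both approaches should select the same rows
print(isTRUE(all.equal(test1_brackets, subset(test.data, Test == "1"))))
```

One practical difference: subset silently drops rows where the condition is NA, while square brackets return a row of NAs for them, so the two only match exactly when the condition has no missing values, as is the case here.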

Now that we have subset the data, let us look at how far apart they are from each other. Most people are intimidated by boxplots. I will not discuss further how to read and interpret these, but you can actually see the difference between the mean, which is the small dot inside each box, and the median, the straight line across each box.

My question is: is the difference between these two data sets statistically significant?

# Note: stat_summary() takes `fun` since ggplot2 3.3.0; `fun.y` is deprecated
ggplot(data = test.data, aes(x = Test, y = Score)) +
    stat_boxplot(geom = "errorbar") +
    geom_boxplot(aes(fill = Test)) +
    stat_summary(fun = mean, geom = "point", aes(group = 1)) +
    ylab("Scores") + xlab("Test") + theme(legend.position = "none")

qqplot

Of course, let us rely on the simple Student's t-test for paired data.

t.test(x = test1$Score, y = test2$Score, alternative = "two.sided", paired = TRUE, 
    conf.level = 0.95)
## 
##  Paired t-test
## 
## data:  test1$Score and test2$Score
## t = 2.018, df = 9, p-value = 0.07432
## alternative hypothesis: true difference in means is not equal to 0
## 95 percent confidence interval:
##  -0.00528  0.09268
## sample estimates:
## mean of the differences 
##                  0.0437

If you need me to compute these manually, I would love to: starting from the standard deviation of the paired differences, to the standard error, the degrees of freedom, until we arrive at the p-value for the t value. If I plot this on a t distribution with 9 degrees of freedom, the two-tailed probability beyond a t value of 2.018 is about 0.07.
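Here is a sketch of that manual computation in R, using the same scores as above. The variable names are my own choice for illustration, and the results should agree with the t.test output.

```r
# Scores from Test 1 and Test 2 (same values as in the data frame above)
test1_scores <- c(0.54, 0.573, 0.575, 0.589, 0.639, 0.624, 0.64, 0.565, 0.694, 0.605)
test2_scores <- c(0.632, 0.535, 0.556, 0.533, 0.516, 0.575, 0.57, 0.608, 0.58, 0.502)

# Paired differences and their summary statistics
d <- test1_scores - test2_scores
n <- length(d)
mean_diff <- mean(d)           # mean of the differences
sd_diff <- sd(d)               # standard deviation of the differences
se_diff <- sd_diff / sqrt(n)   # standard error of the mean difference

# t value, degrees of freedom and two-sided p-value
t_value <- mean_diff / se_diff
df <- n - 1
p_value <- 2 * pt(-abs(t_value), df)

# 95 percent confidence interval for the mean difference
ci <- mean_diff + c(-1, 1) * qt(0.975, df) * se_diff

print(round(c(t = t_value, df = df, p = p_value), 4))
print(round(ci, 5))
```

The confidence interval includes 0, which tells the same story as a p-value above 0.05.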

At a 95% confidence level, I cannot say that they are different from each other, since the p-value is more than 0.05. Why? Let's construct the hypothesis statement first.

H0: mean of Test 1 = mean of Test 2 (the two tests have equal means)
HA: mean of Test 1 ≠ mean of Test 2 (the means differ in either direction, hence two-sided)

Given the p-value of 0.07 against a significance level cutoff of 0.05, we retain the null hypothesis. Strictly speaking, this does not prove that the two tests are equal; it only means we do not have enough evidence to say they are different. With all of these languages I speak, what do they really mean?

If you look at the means of the two data sets, they are numerically different: 0.6044 and 0.5607, respectively. But since that difference is not statistically significant, I can say that the request I am working on is not worth looking into at a lower level. This is where decision errors come into play: what would be the implication if I continue looking for answers, versus if I decide not to because it is not worth looking at? Decision errors are another topic, maybe for the next few posts.