Comments on everyday analytics: PCA and K-means Clustering of Delta Aircraft
Myles Harrison https://plus.google.com/101316608244686723625
2017-09-14T09:18:08.442-04:00

2015-04-15T11:10:47.302-04:00
Glad you enjoyed it, Tim. Happy to help.<br /><br />My understanding of PCA is that it accounts for collinearity by picking a combination of variables for each principal component such that the components explain the maximum amount of variance in the data and are orthogonal to one another. In fact, if you read the <a href="http://en.wikipedia.org/wiki/Principal_component_analysis" rel="nofollow">definition of PCA</a>, you'll see that dealing with correlated and collinear variables is much of the motivation behind the method. There's also some discussion of your concerns <a href="http://stats.stackexchange.com/questions/9542/is-pca-unstable-under-multicollinearity" rel="nofollow">here</a>, but let me know if you find any more documentation or a reference that suggests otherwise.<br /><br />I will freely admit that there are flaws in my analysis (which others have pointed out). As PCA picks components that explain the maximum amount of variability, it should have been no surprise that the seat class variables became the determining factor. Using dummy variables is appropriate for methods like linear regression but not for PCA, as the dummy variables will always take extreme values (0 or 1) whereas the scaled numerical variables will lie in the range, a fact which I did not consider at the time but which other readers have since pointed out.
Myles Harrison https://www.blogger.com/profile/03459119352415846492

2015-04-13T14:54:16.861-04:00
Excellent! I cloned your repo and followed along in R. Simply excellent, thank you. You've saved me oodles of time in figuring out all those packages also. And rgl is cool too :)<br /><br />Do you know if it would've been advisable to remove some of the collinear variables before performing PCA? 
For example, there are several variables that grow with the physical size of the plane and one of these would've captured enough of the information. Same thing with 'luxury' attributes. When I did 'loadings(pc)' on the version of the pc variable as defined in line 33, I couldn't see anything that explained the PC ellipsoid, except perhaps that the collinear variables seemed to have similar weights. Perhaps I'll do it and see what happens.<br /><br />I too am a consultant, and so often the customer wants to know 'what's going on' in their data, and the minute I map data from a space they get to some derived space, it sends them into conniptions. So I'd be interested in knowing how to make the principal components explainable. Often for the case of making a good prediction, no one cares. But in trying to get the customer to sign off on my work, they do care.<br /><br />thanks again<br />tim
timtiptoes https://www.blogger.com/profile/13766634932593143786

2014-12-04T18:44:10.111-05:00
Thank you! And you are, of course, most welcome. Glad you enjoyed the post.
Myles Harrison https://www.blogger.com/profile/03459119352415846492

2014-12-04T15:27:19.434-05:00
How interesting! The aviation industry is so vast, and there are so many different areas you could analyze. I really enjoyed reading your findings. 
Thanks so much for sharing this!
Valerie http://www.avjet.com/

2014-06-23T19:11:28.710-04:00
You should update the analysis to show what will happen should Delta add the A380 to its fleet. This article got me thinking about that: <br /><br />http://www.businessweek.com/articles/2014-06-23/can-the-airbus-a380-crack-a-u-dot-s-dot-fleet-yes-and-heres-why#r=hpt-fs
Nguyen Van Falk https://www.blogger.com/profile/07459960525696568638
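
The collinearity question raised in the comments above can be illustrated with a small sketch. This is not code from the post (which uses R); it is a minimal pure-Python example on made-up data, and the names `length` and `wingspan`, the slope 0.9 and the noise level are all assumptions chosen only for illustration. It relies on a closed form: for two standardized variables with correlation r, the correlation matrix [[1, r], [r, 1]] has eigenvalues 1 + r and 1 - r with eigenvectors (1, 1)/sqrt(2) and (1, -1)/sqrt(2), so no numerical solver is needed.

```python
import math
import random
import statistics


def standardize(values):
    """Scale a sequence to zero mean and unit sample standard deviation."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def correlation(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    zx, zy = standardize(xs), standardize(ys)
    return sum(a * b for a, b in zip(zx, zy)) / (len(xs) - 1)


def pca_two_variables(xs, ys):
    """PCA of two standardized variables using the 2x2 closed form.

    Returns a list of (loading_vector, share_of_variance) pairs,
    sorted so that the component explaining the most variance is first.
    """
    r = correlation(xs, ys)
    s = 1 / math.sqrt(2)
    components = [((s, s), 1 + r), ((s, -s), 1 - r)]
    components.sort(key=lambda c: c[1], reverse=True)
    total = sum(eigenvalue for _, eigenvalue in components)  # equals 2
    return [(vec, eigenvalue / total) for vec, eigenvalue in components]


# Hypothetical data: two size-related measurements that grow together.
random.seed(0)
length = [random.uniform(30, 70) for _ in range(50)]
wingspan = [0.9 * x + random.gauss(0, 1.5) for x in length]

result = pca_two_variables(length, wingspan)
for loadings, share in result:
    print(loadings, round(share, 3))
```

With strongly correlated inputs like these, the first component carries almost all of the variance and weights both variables equally, which is consistent with the observation above that collinear variables end up with similar loadings. Dropping one of the two variables beforehand would lose very little information in this toy case, but whether that holds for the real aircraft data is not something this sketch can show.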