Max Hemingway

~ Musings as I work through life, career and everything.

Tag Archives: Architecture

Data Scientist Skill Set

05 Monday Jan 2015

Posted by Max Hemingway in Data Science

Tags

Architecture, Data Science

O’Reilly released a free downloadable report a while back that presents the results of a survey of Data Scientists across the industry – circa 250 respondents. The report looks at a list of skills and classifies Data Scientists into 4 main categories:

  • Data Businessperson
  • Data Creative
  • Data Developer
  • Data Researcher

Under each of these headings the roles are defined as:

[Figure: DS Types]

As an Architect I can see a fit with the “Jack of All Trades” box, although I think an Architect would also reach across the Researcher, Creative and Businessperson categories if we were to be classified in the same way. It is important for an Architect to understand the skills that a Data Scientist needs across these areas, because there will be more opportunities to work side by side with Data Scientists on solutions and architectures.

The report gives a list of skills that a Data Scientist has under each classification of Data Scientist:

  • Algorithms (ex: computational complexity, CS theory)
  • Back-End Programming (ex: JAVA/Rails/Objective C)
  • Bayesian/Monte-Carlo Statistics (ex: MCMC, BUGS)
  • Big and Distributed Data (ex: Hadoop, Map/Reduce)
  • Business (ex: management, business development, budgeting)
  • Classical Statistics (ex: general linear model, ANOVA)
  • Data Manipulation (ex: regexes, R, SAS, web scraping)
  • Front-End Programming (ex: JavaScript, HTML, CSS)
  • Graphical Models (ex: social networks, Bayes networks)
  • Machine Learning (ex: decision trees, neural nets, SVM, clustering)
  • Math (ex: linear algebra, real analysis, calculus)
  • Optimization (ex: linear, integer, convex, global)
  • Product Development (ex: design, project management)
  • Science (ex: experimental design, technical writing/publishing)
  • Simulation (ex: discrete, agent-based, continuous)
  • Spatial Statistics (ex: geographic covariates, GIS)
  • Structured Data (ex: SQL, JSON, XML)
  • Surveys and Marketing (ex: multinomial modeling)
  • Systems Administration (ex: *nix, DBA, cloud tech.)
  • Temporal Statistics (ex: forecasting, time-series analysis)
  • Unstructured Data (ex: noSQL, text mining)
  • Visualization

ML = Machine Learning

OR = Operations Research

From reading other reports, this is by no means a full list of skills, but it provides a good insight into what a Data Scientist needs in their skills bag.

The report then looks at the typical tasks that would be covered by each category and splits these into 22 core tasks grouped under 5 main task areas.

[Figure: Data Science Skills 2]

The visualisation below illustrates the results, showing the skills and tasks for each Data Scientist type as a percentage of the skill that is needed.

[Figure: Data Science Skills]

Overall, a good report that highlights the business areas and skills of a Data Scientist.

Report Source

Analyzing the Analyzers

An Introspective Survey of Data Scientists and Their Work

http://www.oreilly.com/data/free/analyzing-the-analyzers.csp

Playing a Game with Innovation and Thinking

19 Friday Dec 2014

Posted by Max Hemingway in Architecture, DevOps/OpsDev, Innovation

Tags

Architecture, DevOps, Information Theory, Innovation, OpsDev

I have been looking at ways to help me with innovation, thinking and looking outside the box. Lots of different methodologies exist, and there is no right or wrong choice of which method to use or when to apply it.

After investigating, reading about and learning several methods in this arena, I have come up with a set of “Playing Cards” that allow me to play games with Innovation and Thinking.

I took a pack of plain/blank playing cards and wrote out cards with different methodologies and ways of tackling/working on innovation.

[Figure: Innovation Cards]

The pack is currently based on 3 models, and I am looking to add a few more as I develop it (other methodologies are available):

  • 4 Site Model
  • Peter Drucker Thinking
  • SCAMPER

I have also added some:

  • Problem challenge cards – to add different problems to the area you are working on
  • Lens Cards – to challenge you to look at innovation through different lenses or view points

How to play the game

For the problem or area that I want to tackle, I shuffle the pack and deal 4-5 cards, then work through the problem based on what has been dealt.

[Figure: Dealt Innovation Cards]

The lens cards may be shuffled in the main pack or dealt at the side one at a time.

Set a time limit on the cards dealt and then brainstorm, writing everything down.

No thought or idea is a bad idea until it is qualified in or out.

When the time is up either play a different lens card against the cards on the table – or collect them up and shuffle the deck and start again.
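The dealing step can be sketched in a few lines of Python. This is only an illustration of the shuffle-and-deal idea, not how the physical pack is made up: the SCAMPER prompts are the standard seven, but the entries for the other models, the problem challenge cards and the lens cards are hypothetical placeholders, as are the names `MAIN_PACK`, `LENS_CARDS` and `deal`.

```python
import random

# SCAMPER's seven prompts are the standard ones; every other label below is
# a hypothetical placeholder, because the post does not list the real cards.
SCAMPER = ["Substitute", "Combine", "Adapt", "Modify",
           "Put to another use", "Eliminate", "Reverse"]
MAIN_PACK = (
    [f"SCAMPER: {prompt}" for prompt in SCAMPER]
    + [f"4 Site Model: card {n}" for n in range(1, 5)]
    + [f"Peter Drucker Thinking: card {n}" for n in range(1, 5)]
    + [f"Problem challenge: card {n}" for n in range(1, 4)]
)
LENS_CARDS = [f"Lens: viewpoint {n}" for n in range(1, 4)]


def deal(pack, hand_size=4, rng=None):
    """Shuffle a copy of the pack and split it into (hand, remaining)."""
    if not 0 <= hand_size <= len(pack):
        raise ValueError("hand_size must be between 0 and the pack size")
    rng = rng or random.Random()
    shuffled = list(pack)  # copy, so the original pack stays untouched
    rng.shuffle(shuffled)
    return shuffled[:hand_size], shuffled[hand_size:]


if __name__ == "__main__":
    # Deal 4-5 cards from the main pack and one lens card at the side.
    hand, _ = deal(MAIN_PACK, hand_size=random.choice([4, 5]))
    lens, _ = deal(LENS_CARDS, hand_size=1)
    print("Cards on the table:", hand)
    print("Lens card:", lens[0])
```

Starting again is simply another call to `deal`, which reshuffles the whole pack each time.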

Results

I have found that using the cards gives me a different view across different methodologies, rather than just applying one.

Sometimes the cards do not produce much on the page, but at other times ideas and innovations flourish around the problem or area I have been looking at.

Next I plan to add more methodologies to the pack and expand the cards already produced, although I don’t think that I will expand the pack much more, as it may then become too large and cumbersome to be effective.

I do have some blank cards left though, so I may innovate something new to do with them.

Learning Data Science

12 Friday Dec 2014

Posted by Max Hemingway in Architecture, Data Science

Tags

Architecture, Data Science

As an Architect I am always looking into ways of working with new technologies, innovations and fields. One of these is Data Science, and I am currently undertaking a set of courses to gain an understanding of the field.
Having a level of understanding will allow me to work more closely with data scientists, helping them with suitable solutions, as well as increase my skills in manipulating data.

Johns Hopkins University is currently running a set of courses on Data Science consisting of 9 modules. Below is a dependency chart for anyone wanting to take them.
[Figure: Coursera Johns Hopkins Specialization in Data Science course dependency information]

There are 9 modules, which can be taken free of charge, or you can pay about $30.00 per course to get a certificate and take a final capstone project to test your understanding.

Each course lasts around 4 weeks and consists of video-based lectures, forums, projects and knowledge-test quizzes.

Johns Hopkins considers two forms of dependency for these courses:

Hard dependency: Students will be required to know material from the prerequisite course. Taking the dependent course simultaneously will be challenging and only possible for highly motivated students willing to work ahead of the course schedule for the prerequisite. Taking hard dependent courses out of order is not possible unless the student already knows the material covered in the prerequisite course.

Soft dependency: Knowledge of material from the prerequisite course is recommended and useful. Concurrently taking the prerequisite course and the dependent course is possible. It is not recommended to take them out of order, but would be possible for highly motivated students willing to self teach components of the prerequisite course as needed.

The courses are listed below, in the order in which they should be taken, with links to each course.

The Data Scientist’s Toolbox
https://www.coursera.org/course/datascitoolbox

This is the primary introductory course for the specialization. It should be taken first and has no prerequisite courses. Students should be computer literate, have programmed in at least one computer language and be motivated self learners.

R Programming
https://www.coursera.org/course/rprog

This is the most crucial course for the remainder of the specialization. It is softly dependent on The Data Scientist’s Toolbox. It should be taken before the remaining courses in the series.

Getting and Cleaning Data
https://www.coursera.org/course/getdata

This course has hard dependencies on R Programming and The Data Scientist’s Toolbox.

Exploratory Data Analysis
https://www.coursera.org/course/exdata

This course has hard dependencies on R Programming and The Data Scientist’s Toolbox.

Reproducible Research
https://www.coursera.org/course/repdata

This course has hard dependencies on R Programming and The Data Scientist’s Toolbox.

Statistical Inference
https://www.coursera.org/course/statinference

This course has hard dependencies on R Programming and The Data Scientist’s Toolbox. In addition, students will need basic (non calculus) mathematics skills.

Regression Models
https://www.coursera.org/course/regmods

This course has hard dependencies on R Programming, The Data Scientist’s Toolbox and Statistical Inference.

Practical Machine Learning
https://www.coursera.org/course/predmachlearn

This course has hard dependencies on R Programming, The Data Scientist’s Toolbox and Regression Models. It has a soft dependency on Exploratory Data Analysis.

Developing Data Products
https://www.coursera.org/course/devdataprod

This course has hard dependencies on R Programming, The Data Scientist’s Toolbox and Reproducible Research. It has a soft dependency on Exploratory Data Analysis.

*material from Coursera Johns Hopkins University
https://www.coursera.org/jhu
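
The hard and soft dependencies listed above form a small dependency graph, so a valid study order can be worked out with a topological sort. The following is a minimal sketch of that idea using Python's standard `graphlib` module (Python 3.9+). The dictionaries transcribe the dependencies as stated in the course descriptions above; the names `HARD`, `SOFT`, `study_order` and `can_take` are my own illustrative choices and not part of the Coursera material.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

TOOLBOX = "The Data Scientist's Toolbox"
RPROG = "R Programming"

# Hard dependencies: course -> prerequisites that must already be known.
HARD = {
    TOOLBOX: set(),
    RPROG: set(),  # only softly dependent on the Toolbox
    "Getting and Cleaning Data": {RPROG, TOOLBOX},
    "Exploratory Data Analysis": {RPROG, TOOLBOX},
    "Reproducible Research": {RPROG, TOOLBOX},
    "Statistical Inference": {RPROG, TOOLBOX},
    "Regression Models": {RPROG, TOOLBOX, "Statistical Inference"},
    "Practical Machine Learning": {RPROG, TOOLBOX, "Regression Models"},
    "Developing Data Products": {RPROG, TOOLBOX, "Reproducible Research"},
}

# Soft dependencies: recommended, but can be taken concurrently.
SOFT = {
    RPROG: {TOOLBOX},
    "Practical Machine Learning": {"Exploratory Data Analysis"},
    "Developing Data Products": {"Exploratory Data Analysis"},
}


def study_order(hard, soft):
    """Return one order that respects both hard and soft dependencies."""
    combined = {
        course: set(prereqs) | soft.get(course, set())
        for course, prereqs in hard.items()
    }
    return list(TopologicalSorter(combined).static_order())


def can_take(course, completed, hard=HARD):
    """True if every hard prerequisite of the course has been completed."""
    return hard[course] <= set(completed)


if __name__ == "__main__":
    for position, course in enumerate(study_order(HARD, SOFT), start=1):
        print(position, course)
```

Treating soft dependencies as if they were hard gives the most conservative order. Passing an empty dictionary for `soft` shows what is possible for someone willing to self-teach the recommended material.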

A formula for Innovation

08 Monday Dec 2014

Posted by Max Hemingway in Innovation

Tags

Architecture, Innovation

I have been looking into innovation and what drives it for some time now, whilst bringing it to and demonstrating it in my roles.

I started to look at what innovation actually is and came across various equations for innovation that people have tried to create. These equations are used in various ways, from consultancies describing how they apply innovation to evaluating and overcoming the resistors to innovation. However applicable these were, I wanted to apply a formula to part of the challenge, so I came up with:

[Figure: A Formula for Innovation]

Simple, but it works for what I am trying to achieve for now. It may need some refinement as I work with it going forward.

One thing that stands out for me though is the following question:

Is Reuse Innovation?

Something that is new to some people could be considered innovation, even though it is in fact reuse, because they have not been exposed to it before.

“If it’s obvious prove it. If you can’t prove it, it’s not obvious.”

05 Friday Dec 2014

Posted by Max Hemingway in Governance

Tags

Architecture, Proving It

This is a phrase that I use a lot. I first came across it many years ago from someone I previously worked with, and it has stuck with me ever since.

When writing documents, how often do we assume that the reader will know what we mean, or that just because we know something is there, they do too? I have seen this on many occasions, and have occasionally fallen into the trap myself, where you write about something as though the reader knows all the facts but you don’t convey them.

An example of this could be a statement in a proposal or technical document:

“The device has two power supplies.”

  • To a technical mind the instant reaction might be that this will probably be connected to two separate power supplies and backed up by generators and UPS.
  • To a financial mind the instant reaction might be that this is extra cost not justified.
  • To the engineer who checks the proposal – I wonder how that’s going to be configured?

In fact, the writer forgot to mention that the device was a chassis that needed two power supplies to provide enough power to all the devices placed into it, and that it is fed from one power supply.

OK – in reality you should always look for redundancy, and in this example that could equal four power supplies, but it shows how easily one statement can be misinterpreted because it was obvious to the writer and not to the reader.

Just food for thought… Try running that phrase against the next document, email, etc. that you write, and put yourself in the reader’s place.

License

Creative Commons Licence
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
