Data Exploration and Statistical Analysis of Facebook Campaigns using R

Kathleen Lara
8 min read · Jan 9, 2021

Most of the data exploration and predictive analysis projects I have done use Python. For this article, I decided to use R for a quick data exploration and statistical analysis of a Facebook campaigns dataset that is available for research.

Image from Pixabay

Business Problem

Let’s say you run a business and want to understand how your social media posts and campaigns are performing, and how you could save costs by focusing on the initiatives that actually contribute to overall success.

Some questions you might have: how often does your team post each week, in which months do they usually post, and which variables affect interactions the most?
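Questions like these usually come down to simple counts and group averages. Here is a minimal sketch in base R on a tiny made-up data frame; the column names (`post_month`, `post_weekday`, `paid`, `total_interactions`) and all values are assumptions for illustration, not the real dataset.

```r
# Toy data standing in for the real dataset.
# Column names and values are hypothetical.
posts <- data.frame(
  post_month         = c(1, 1, 2, 2, 2, 3),
  post_weekday       = c(1, 3, 3, 5, 7, 3),
  paid               = c(0, 1, 0, 1, 0, 0),
  total_interactions = c(120, 340, 90, 410, 60, 150)
)

# How many posts went out in each month and on each weekday?
posts_per_month   <- table(posts$post_month)
posts_per_weekday <- table(posts$post_weekday)

# Average interactions per month
mean_by_month <- aggregate(total_interactions ~ post_month,
                           data = posts, FUN = mean)

print(posts_per_month)
print(posts_per_weekday)
print(mean_by_month)
```

The same counts could be produced with `dplyr::count()` once the tidyverse is loaded; base R is used here only so the snippet runs without extra packages.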

Formulating a Hypothesis

Having good content and creative visuals is important. Management might then think that this is enough, and that paid posts are not needed to get more visibility or interactions. You, on the other hand, are pitching to allocate more budget to paid posts to get more interactions. So we formulate a hypothesis:

Ho: Paid posts mean higher interactions

Ha: Paid posts do not mean higher interactions
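A hypothesis pair like this is typically checked by comparing interactions for paid and unpaid posts. The following is a minimal sketch using Welch's two-sample t-test from base R on simulated numbers, not the real dataset; the vectors `paid` and `unpaid` and their Poisson rates are made up for illustration. Note that `t.test` itself takes "the group means are equal" as its null hypothesis, and `alternative = "greater"` asks whether the first group's mean is larger.

```r
set.seed(42)  # make the simulated numbers reproducible

# Simulated interaction counts per post. These are NOT the real data;
# the group sizes and rates are assumptions chosen for illustration.
unpaid <- rpois(60, lambda = 180)
paid   <- rpois(40, lambda = 220)

# Welch two-sample t-test (t.test's default does not assume equal variances).
# One-sided: is the mean for paid posts greater than for unpaid posts?
result <- t.test(paid, unpaid, alternative = "greater")

print(result$estimate)  # sample means of the two groups
print(result$p.value)   # a small p-value is evidence of a higher mean for paid posts
```

On real interaction counts, which tend to be heavily skewed, a rank-based alternative such as `wilcox.test()` may be worth considering as well.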

Let’s now explore the data and see what would be useful for us in our hypothesis testing.

Introduction to the dataset

The Facebook metrics dataset is real data from a renowned cosmetics brand, with 19 columns and 499 observations. It is almost entirely quantitative, with only a few ordinal columns. The dataset can be found at this link.

# Load the libraries needed (ggplot2 is also attached by tidyverse)
library(tidyverse)
library(ggplot2)
# Load the Facebook CSV data hosted at data.world
facebook <- read_csv("https://query.data.world/s/3vb7yr2iodbwybrckhyv5rbwfh2tct")

After loading the dataset, we want to look at the variables and check the data types to see if we need to do any transformations. We also want to get an idea of what the values are and make sure we can work with them.

# Checking Data Types
sapply(facebook, typeof)
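To illustrate what this kind of check can reveal, here is a small sketch on a made-up data frame; the column names (`type`, `paid`, `total_interactions`) and values are assumptions, not the actual contents of `facebook`. It compares `typeof()` with `class()`, counts missing values, and shows one possible transformation: turning a 0/1 flag into a labelled factor.

```r
# Toy data frame; column names and values are hypothetical.
posts <- data.frame(
  type               = c("Photo", "Status", "Link", "Photo"),
  paid               = c(0, 1, NA, 0),
  total_interactions = c(120L, 340L, 90L, 410L),
  stringsAsFactors   = FALSE
)

# typeof() reports the storage type; class() reports how R treats the column
col_types   <- sapply(posts, typeof)
col_classes <- sapply(posts, class)

# Count missing values per column
na_counts <- colSums(is.na(posts))

# Treat the 0/1 paid flag as a labelled factor (missing values stay NA)
posts$paid <- factor(posts$paid, levels = c(0, 1),
                     labels = c("unpaid", "paid"))

print(col_types)
print(col_classes)
print(na_counts)
```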