Analyzing Brazilian Deputies’ Votes with Tableau

8 min readNov 16, 2023

I like to think of data analysis as creating a new painting. The data is my canvas, and the tools I use to extract insights from it are my brushes. Sometimes it can be difficult to draw these insights, so there are tools that can help us. Today, we are going to talk about one of them.

Introduction

The history behind this post is the fact that I have been learning Tableau for the past few months in order to use it in my job and I have to say that it has been a journey full of knowledge. Tableau is a very powerful tool and if one knows how to handle it then it can become a great asset in the process of analyzing and extracting insights from any set of data.

That being said, I am not yet a Tableau expert so I was looking for some kind of data analysis project that I could tackle in my free time and use to enhance my Tableau skills. Then I came across this link that contains the results of some relevant projects voted in the Brazilian Chamber of Deputies from March 2019 to December 2022. This link is from a well-known Brazilian news portal called G1. This gave me the idea to create a Tableau dashboard to better analyze those deputies’ votes.

I chose to use the same data as G1 because they had already picked the most important voted projects. This meant I had to scrape their website to collect its data. In the next section of this post, I’ll take a moment to explain how I accomplished this.

Web Scrapping

To collect data from the website, I used Python along with Selenium, which is a tool that helps computer programs work with web browsers like Chrome or Firefox. It’s commonly used to automate tasks on websites, such as clicking buttons, filling out forms, and getting data.

Here’s a simple step-by-step overview of what I did:

1. I used Selenium to interact with the website.
2. I gathered all the details about the deputies’ votes.
3. I processed and saved this information in a CSV file that can be later imported into Tableau for analysis.

The first and third steps were quick to set up. However, the second step took some time. I had to carefully inspect the HTML code to locate the right tags and attributes that pointed to the specific data I needed. The following code demonstrates how to retrieve both the project name and its corresponding date by searching for specific class tags. Each string that I used in the find_elements method represents a CSS class that I found by looking at the HTML code.

project_name = driver.find_elements(By.CSS_SELECTOR, ".ProjectCard_title__1wX4b")[0].text
project_round = driver.find_elements(By.CSS_SELECTOR, ".ProjectCard_round__1B6kB")[0].text
project_name += " - " + project_round

project_vote_date = driver.find_elements(By.CSS_SELECTOR, ".ProjectCard_date__13ql0")[0].text

The whole Python script can be seen at the end of the post. After I ran the script, I got the following CSV file. Each row in the CSV file represents a deputy’s vote for a particular project on a specific date. The entire file contains 80,675 rows, involving 679 deputies from 32 different political parties.

Tableau

Access the dashboard here

Once I gathered the data, I successfully imported it into Tableau and began working on the dashboard. My goal was to design a user-friendly dashboard that would make it easy for users to understand how the votes were spread across different states and parties. To achieve this, I decided to incorporate a map to provide a visual representation. This was my first experience with map visualization in Tableau, and I found that Tableau simplifies the process, making it quite straightforward. The dashboard can be accessed here. As a way to reach a broader audience, I chose to add a filter that allows users to switch between Portuguese and English. In order to speed up the process of translating project names, I used ChatGPT so I could translate all 159 projects quickly.

The dashboard’s interaction is simple: users can browse through the projects using the filter on the top, and all the charts will update accordingly. They can also switch between the Percentage or Value views to see how the votes are distributed using the bar chart visualizations. The vote type shown in the top right corner bar chart can also be utilized to filter the party and table visualizations below. In the bottom right corner, we have a simple table that lists all the deputies and their respective votes. Moreover, the map serves a dual purpose. It acts as a heatmap, enabling users to observe how votes are distributed across the Brazilian states. Additionally, it functions as a filter, allowing users to choose a specific state on the map, which then filters the other visualizations accordingly.

Insights

Some insights that I had from using the dashboard:

Some projects clearly generated more disagreement than others. For instance, the proposal to include COAF (Council for Financial Activities Control) within the Ministry of Justice was one of the most controversial projects as one can see below. We can observe that a majority of the Northeast region voted against it.

I was also quite surprised when I noticed that some deputies voted against the proposal to revoke Flordelis’s mandate. For those who may not be aware, Flordelis was a female deputy who was found guilty of planning her husband’s murder among other crimes. Therefore, it was unexpected to see some individuals opposing it. The image below shows that a majority of the deputies voted in favor, while 7 voted against.

Conclusions and Tips

To be honest, I noticed that the dashboard was somewhat slow after publishing it. Overall, I believe the user experience is still acceptable, but I was expecting it to run a bit faster. The main problem happens when the user attempts to filter the dashboard by either clicking on the map or in the Vote bar chart as it takes a few seconds for the dashboard to update.
I faced a bit of a challenge when loading the CSV file into Tableau. I had to do it multiple times, and I observed that Tableau sometimes struggled to load the file accurately — either missing some rows or duplicating others. To avoid this, I made sure the total rows displayed in Tableau matched the total rows in the file.
A feature that I really liked is the visibility control option that Tableau provides. With that feature, I can give the user the possibility to hide and show any visualization in the dashboard based on a set of parameters. I used that to enable the Percent (%) and Language switches. The image below illustrates where to find it:

Another valuable tip is to implement a dashboard review process if possible. Constructing dashboards in Tableau is similar to creating any other software piece. Hence, it’s crucial to conduct tests and have someone else review the dashboard. This helps prevent errors and ensures accurate insights. To do that, one should remember to build a well-organized dashboard with meaningful and clear names for calculated fields, parameters, sheets, and so on. This can make the process of review much easier.
My overall impression is that Tableau makes analyzing data much quicker. After preparing my data, creating a useful dashboard was fast and easy with Tableau. A useful tip is to process all the data before loading it into Tableau. This way, Tableau can focus on making the best visualizations without dealing with data processing which can slow down the dashboard.

Source Code

Below, one can see the code I used to extract the data from the G1 website. One must install Selenium and Pandas to run it.

from selenium import webdriver
from selenium.webdriver.common.by import By
import pandas as pd
from data_to_eng import proj_translations

class VotesScrapper:
    def __init__(self, driver=webdriver.Firefox()):
        self.driver = driver
        self.link = "https://interativos.g1.globo.com/politica/2019/como-votam/camara-dos-deputados/brasil"
        self.number_pages = 14

    def go_next_page(self):
        next_page = None
        for item in self.driver.find_elements(By.CSS_SELECTOR, ".Pagination_item__1JGT8 "):
            if item.get_attribute('title') == "Próxima página":
                next_page = item
                break
        if next_page is not None:
            next_page.click()

    def get_info_card(self):
        project_name = self.driver.find_elements(By.CSS_SELECTOR, ".ProjectCard_title__1wX4b")[0].text
        round = self.driver.find_elements(By.CSS_SELECTOR, ".ProjectCard_round__1B6kB")[0].text
        project_name += " - " + round
        print(f"Collecting: {project_name}")
        
        vote_date = self.driver.find_elements(By.CSS_SELECTOR, ".ProjectCard_date__13ql0")[0].text
        vote_date = vote_date.replace("Votado em ", "")
        
        needed_votes = self.driver.find_elements(By.CSS_SELECTOR, ".BarChart_textNecessaryVotes__3BaO7")
        if needed_votes:
            needed_votes = needed_votes[0].text
            needed_votes = needed_votes.replace(" votos necessários", "")
        else:
            needed_votes = ""

        project_desc = self.driver.find_elements(By.CSS_SELECTOR, ".ProjectCard_description__2MES2")[0].text
        count_votes_list = (self.driver.find_elements(By.CSS_SELECTOR, ".Accordion_container__3K-ma"))
        info = []
        for vote_type in count_votes_list:
            vote = None
            if "| Sim" in vote_type.text:
                vote = "Sim"
            elif "| Não votou" in vote_type.text:
                vote = "Não votou"
            elif "| Não" in vote_type.text:
                vote = "Não" 
            elif "| Abstenção" in vote_type.text:
                vote = "Abstenção" 
            elif "| Obstrução" in vote_type.text:
                vote = "Obstrução"
            elif "| Ausente" in vote_type.text:
                vote = "Ausente"
            if vote is not None:
                vote_type.find_elements(By.CSS_SELECTOR,".Accordion_anchor__1HBuv")[0].click()
                congressmen_list = vote_type.find_elements(By.CSS_SELECTOR,".ListCandidates_listItem__3a7hN")
                for congressman in congressmen_list:
                    name = congressman.find_elements(By.CSS_SELECTOR, ".ListCandidates_candidateName__2NH81")[0].text
                    party = congressman.find_elements(By.CSS_SELECTOR, ".ListCandidates_candidateParty__1Wac9")[0].text
                    state = congressman.find_elements(By.CSS_SELECTOR, ".ListCandidates_candidateState__1FYOr")[0].text
                    info.append([project_name, vote_date, project_desc, name, party, state, vote, needed_votes])
                vote_type.find_elements(By.CSS_SELECTOR,".Accordion_anchor__1HBuv")[0].click()
        return info

    def translate_votes(self, voto):
        if voto == "Sim":
            return "Yes"
        elif voto == "Não":
            return "No"
        elif voto == "Não votou":
            return "Didn't vote"
        elif voto == "Abstenção":
            return "Abstention"
        elif voto == "Ausente":
            return "Absent"
        elif voto == "Obstrução":
            return "Obstruction"
    
    def build_df(self, data):
        df = pd.DataFrame(data, columns=["Projeto", "Data", "Descricao", 
                                         "Dept Nome", "Dept Partido",
                                         "Dept Estado", "Voto", "Votos necessarios"])
        df = df.drop_duplicates()
        #Creating English columns that will be used in the dashboard
        df["Project"] = df.Projeto.apply(lambda proj_name: proj_translations[proj_name])
        df["Vote"] = df["Voto"].apply(lambda voto: self.translate_votes(voto))
        
        df["Dept State"] = df["Dept Estado"]
        df["Dept Name"] = df["Dept Nome"]
        df["Dept Party"] = df["Dept Partido"]
        df["Date"] = df["Data"]
        return df
    
    def run(self, output_filename="data.csv"):
        self.driver.get(self.link)
        
        list_of_proj_cards = []
        for _ in range(self.number_pages):
            for card in self.driver.find_elements(By.CSS_SELECTOR, ".Card_wrapper__3sGYE "):
                list_of_proj_cards.append(card.find_elements(By.CSS_SELECTOR, ".Card_link__2A8I-")[0].get_attribute('href'))
            self.go_next_page()

        all_data = []
        for idx, card_link in enumerate(list_of_proj_cards):
            print(f"Processing: {idx+1} of {len(list_of_proj_cards)}")
            self.driver.get(card_link)
            all_data += self.get_info_card()

        df = self.build_df(all_data)
        df.to_csv(output_filename, index=False)
        self.driver.quit()
        print("Done !!!")
        
if __name__ == "__main__":
    VotesScrapper().run()