To try everything Brilliant has to offer-free-for a full 30 days, visit brilliant.org/JohnWatsonRooney . The first 200 of you will get 20% off Brilliant’s annual premium subscription.
@mathewrtaylor
7 ай бұрын
Great tips! A suggestion as an addition, I would add to help with the potential memory issues of all the data frames is to import gc, and in your processing loop add: del [[df]] gc.collect This will delete and remove from memory the df dataframe since you've already appended it to data in each iteration. This has saved me several times with memory issues. Hope this helps!
@JohnWatsonRooney
7 ай бұрын
thanks thats very helpful!
@kashifrana6798
2 ай бұрын
Can you please exactly let me know where we need to add these two lines? Thx
@oOBenictusOo
7 ай бұрын
I usually find your videos top-notch, but this case can also be permanently solved with Power Query in a maximum setup time of 5 minutes... looking forward to your next video, cheers.
@rcdny
7 ай бұрын
very true. since i learned power query i've saved 2hrs a day
@samuelmartins82
7 ай бұрын
he's just giving an example of the many types of automations that can be done with python. If you just need simple data processing, power query will do the job. But when you need to do something more complex, like getting some information from the web in real time, sending several emails with reports, etc.. power query will be useless
@oOBenictusOo
7 ай бұрын
@@samuelmartins82 Agreed, but here's the context as described in the title: it's a simple Excel task and not complex. Sure, it can be solved with Python, but I don't agree with the argument about wasting time in this context. This recurring task is solved faster and, in the long run, more efficiently with Power Query. Nonetheless, it's interesting because the presented solution could be the beginning of a slightly more complex solution.
@imconfused1237
7 ай бұрын
@@oOBenictusOoAgree, these techniques are ‘building blocks’ for much more involved and often fully bespoke ground-up solutions. It can be considered foundation knowledge. However, as you rightly say, simple ETL data operations are best met with optimised tooling, such as PowerQuery. It also avoids creating solutions which require expensive dev’s to support.
@eziola
7 ай бұрын
Keep doing data cleaning examples in Pandas! Could be a lot of views for videos like these. Great video! Thank you
@tmb8807
2 ай бұрын
I'm sure you know this by now, but the Polars read_csv method supports glob patterns, so the loop approach is unnecessary - you can simply pass _folder + "/*.csv"_ as the source parameter and the concatenation will be done automatically.
@JohnWatsonRooney
2 ай бұрын
Yea you are right that’s much better thanks for clarifying!
@adarshjamwal3448
7 ай бұрын
Great job john thanks for this content. I think your channel may be grow more. If you make and put another types of content like django, flask, javascript like that.
@tridibbiswas3361
7 ай бұрын
Thank you the video. These simple tutorials help people like to grasp some of the rudiments of the learning a programming language like python. BTW could you share the like to your dot files for nevoim. Would like to use of the stuff seen in your nevoim and figure out how to do it
@JohnWatsonRooney
7 ай бұрын
hey, thanks. my neovim is basically kickstart.ncim by teej, and a simple tmux. not a lot extra!
@tridibbiswas3361
7 ай бұрын
@@JohnWatsonRooney Thank you
@AliceShisori
7 ай бұрын
thank you for yet another highly-applicable video John. just out of curiosity since you made this video and I know you are more well-versed in data extraction, do you plan to make anything else about excel and data manipulation :D ?
@JohnWatsonRooney
7 ай бұрын
thanks - Yes I'd like to do a bit more on that side, i'll have a think and put some ideas together!
@Pankaj-Verma-
7 ай бұрын
Thanks.
@Kiriletto
4 ай бұрын
Great vid. Can you tell me why, after you have appended the files in the loop (which I assume merges them together within the list) you then concatenate them into a new data store (ie the result)? Aren’t you combining them twice? Thanks
@tmb8807
2 ай бұрын
The variable 'data' here is not a dataframe. It's a Python list which contains multiple distinct dataframes (one for each file in 'files'). pd.concat is used to append them all into a single dataframe object ('result').
@Rice0987
7 ай бұрын
While I'm using power query.🤗
@Rice0987
7 ай бұрын
But thank you for your channel, cause i'm have started to move in python direction. :)
@markbennett5626
7 ай бұрын
Even more powerful once Python in Excel is released to mainstream
@infamous8541
7 ай бұрын
Why not use knime instead?
@acharafranklyn5167
7 ай бұрын
I noticed you dont like using ipynb files.....Great content
@markbennett5626
7 ай бұрын
Only way to go tbh
@bakasenpaidesu
7 ай бұрын
.
@christiancoronado
7 ай бұрын
Excel is a fast, simple solution for 90% of people out there. Why overcomplicating things.
@crystalcleargirl07
7 ай бұрын
😂 because we have better things to do in life than opening/ importing each file in excel BY HAND where we can automate it
@christiancoronado
7 ай бұрын
Well i don't know what you do for a living, I am a CPA working for a large hospital in CA. And we only used Excel@@crystalcleargirl07
@christiancoronado
7 ай бұрын
@crystalcleargirl07 well I don't know what you do for a living. I am a CPA working for a large hospital in CA, and we only used Excel. What important job do you do that requires you to use Python
@kexec.
7 ай бұрын
if you know python, it is way easier than handling excel gimmicks like VBA. I'm speaking it with my experience.
@christiancoronado
7 ай бұрын
i get it, but i've been doing this for a living for the last 15 years, and I know both VBA and Python and it's just impractical to go for Python in 95% of cases. Also, PowerQuery can go above and beyond the excel row threshold and accomplish the same thing that was done here@@kexec.
@hoegge
7 ай бұрын
Good intro on how to do this in Python, but it is not faster than Excel in any way - neither first nor subsequent runs. A video demonstrating this in Excel would take less than 1-2 minutes.
Пікірлер: 49