I'm trying to set up the Make Data Count plugin (not sure of the terminology here). I've tried installing different Python versions (3.7.17 through 3.9.18) and I keep getting the same error:
$ sudo su - counter -c "cd /usr/local/counter-processor-0.1.04/ && CONFIG_FILE=counter-processor-config.yaml python3 main.py"
Traceback (most recent call last):
File "main.py", line 2, in <module>
import config
File "/usr/local/counter-processor-0.1.04/config/__init__.py", line 1, in <module>
from .config import *
File "/usr/local/counter-processor-0.1.04/config/config.py", line 3, in <module>
from models import *
File "/usr/local/counter-processor-0.1.04/models/__init__.py", line 2, in <module>
from .db_actions import DbActions
File "/usr/local/counter-processor-0.1.04/models/db_actions.py", line 4, in <module>
from .log_item import LogItem
File "/usr/local/counter-processor-0.1.04/models/log_item.py", line 9, in <module>
class LogItem(BaseModel):
File "/home/counter/.local/lib/python3.7/site-packages/peewee.py", line 6172, in __new__
cls._meta.add_field(name, field)
File "/home/counter/.local/lib/python3.7/site-packages/peewee.py", line 5961, in add_field
field.bind(self.model, field_name, set_attribute)
File "/home/counter/.local/lib/python3.7/site-packages/peewee.py", line 4728, in bind
self._db_hook(model._meta.database)
File "/home/counter/.local/lib/python3.7/site-packages/peewee.py", line 4720, in _db_hook
self._constructor = database.get_binary_type()
File "/home/counter/.local/lib/python3.7/site-packages/peewee.py", line 3680, in get_binary_type
return sqlite3.Binary
AttributeError: 'NoneType' object has no attribute 'Binary'
I'm not sure what I'm doing wrong here. Running the main.py command results in this peewee error, and updating peewee gives the same error. Sorry if this is the wrong place to post this. I just signed up for an account today and haven't had time to review what others are posting.
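A note on the traceback above: peewee appears to leave its internal `sqlite3` name set to `None` when no SQLite driver can be imported, so this `AttributeError` may point at the Python interpreter rather than at peewee itself. One possible cause, which this thread does not confirm, is a Python build that lacks the standard `sqlite3` module (for example, one compiled without the SQLite development headers). A minimal check, assuming it is run with the same interpreter that runs main.py (the function name `check_sqlite3` is made up for illustration):

```python
import importlib


def check_sqlite3():
    """Return (ok, message) describing whether the sqlite3 module is importable."""
    try:
        module = importlib.import_module("sqlite3")
    except ImportError as exc:
        # This is the situation in which peewee would have no SQLite driver.
        return False, f"sqlite3 is not importable: {exc}"
    return True, f"sqlite3 is importable (SQLite library {module.sqlite_version})"


if __name__ == "__main__":
    ok, message = check_sqlite3()
    print(message)
```

If the check fails, reinstalling or rebuilding Python with SQLite support is probably more productive than upgrading peewee.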
What version of Dataverse are you running?
v. 5.13 build 1244-79d6e57
Thanks, I want to see what we say at https://guides.dataverse.org/en/5.13/installation/prerequisites.html#counter-processor
Is this on Rocky? Ubuntu? What version, please.
I can give my exact steps. I wrote them into a script if that helps. The script more or less followed that page. It's RHEL8.8.
https://gist.github.com/DonRichards/44b5cd090606fa72a1c7ef5b692f9e47
If it helps, this is what we do in Ansible: https://github.com/GlobalDataverseCommunityConsortium/dataverse-ansible/blob/7d8a425a8e1293717f31c2fd6d01c1d1801cf625/tasks/dataverse-counter.yml
It looks like we use the same version, 0.1.04: https://github.com/GlobalDataverseCommunityConsortium/dataverse-ansible/blob/7d8a425a8e1293717f31c2fd6d01c1d1801cf625/defaults/main.yml#L141
I'll take a look
We can agree that ensure pip is a little weird, right? I'm sure it was me that added it before, I got into my pattern of python3 -m venv venv. Is that what you do? venv?
I didn't but I could. I normally only use venv for development (locally) and not in production.
I see. What do you do in production? When you encounter a requirements.txt file?
Wow, quite a list: https://github.com/CDLUC3/counter-processor/blob/v0.1.04/requirements.txt
All versions seem to be pinned, at least. That's good.
From your gist, it looks like you're doing this:
sudo -u counter python3 -m pip install -r requirements.txt
Plus installing PyYAML and upgrading peewee (like you mentioned).
Do you want me to try it your way?
I think I would stick with whatever version of peewee (whatever that is) is pinned in the requirements.txt file.
And I'm not sure why you're installing PyYAML. Should it be in requirements.txt?
Oh. It is already. PyYAML==5.3.1
Basically, I don't want to bother with venv if it isn't a good test.
It was a result of troubleshooting (manually adding those).
That's what I figured. :big_smile:
I removed them and moved the logic into a venv. Reviewing your Ansible script I realize that I had incorrect permissions for the logs and a typo in a sed command.
Oh! Well, all credit to @Don Sizemore and @Péter Pallinger for the Ansible stuff!
Now I'm getting a "Permission denied: 'state/statefile.json'" but I'm not sure where that file lives.
Sounds like you're a lot closer.
I've been talking with DonS. We have him as a contractor.
Yes, thanks for the help
awesome
We do mention statefile.json here: https://guides.dataverse.org/en/5.13/developers/make-data-count.html#testing-make-data-count-and-your-dataverse-installation (dev guide)
Thanks, that's helpful
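A note on the "Permission denied" message: the path `state/statefile.json` is relative, so it is presumably resolved against the directory main.py is started from. A rough sketch for checking whether the current user can write there, assuming the Counter Processor directory is the working directory (the helper name `writable_report` is hypothetical and not part of Counter Processor):

```python
import os
from pathlib import Path


def writable_report(base_dir, relative_paths):
    """Return {path: status} for each relative path resolved against base_dir."""
    report = {}
    for rel in relative_paths:
        target = Path(base_dir) / rel
        # A file that does not exist yet can only be created if its
        # parent directory exists and is writable.
        probe = target if target.exists() else target.parent
        if not probe.exists():
            report[str(target)] = "missing parent directory"
            continue
        # Creating entries in a directory needs write and search permission.
        mode = os.W_OK | os.X_OK if probe.is_dir() else os.W_OK
        report[str(target)] = "writable" if os.access(probe, mode) else "not writable"
    return report


if __name__ == "__main__":
    # Paths taken from this thread; adjust them to the local setup.
    for path, status in writable_report(".", ["state/statefile.json", "tmp"]).items():
        print(f"{status:>24}  {path}")
```

The result is only meaningful when the script runs as the same user as the processor (the `counter` user in this thread).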
...
@Philip Durbin Is there a simple way to check to see if make-data-count is working? The log files are not 0 bytes and the api returns without errors but also without data.
No data?!?
Did you feed the log files into Counter Processor?
Yes, that part "seems" to be working.
"seems" :grinning:
Yeah, I have no idea what I'm looking at.
Normally I'd know exactly what a system is trying to accomplish and how it works, so I can better identify issues. But Dataverse is mostly new to me, and I was handed this project in the middle of a massive upgrade to another system that I wasn't familiar with. Now I'm trying to refocus on this, and I'm a bit lost. My best understanding is that MDC looks at the logs and creates a type of in-house analytics. I initially pointed the configs to the wrong logs and all I got was zero-byte logs from MDC. Once I changed the pointer to the correct location of the logs, the MDC logs started getting bigger and the API calls no longer seemed to respond with errors.
Does the diagram at https://guides.dataverse.org/en/6.0/admin/make-data-count.html#architecture help?
You might need to open the image in a new tab to read it: https://guides.dataverse.org/en/6.0/_images/make-data-count.png
Double checking a few things. Is this line correct in my config file or am I supposed to change it to something specific?
log_name_pattern: '/usr/local/payara5/glassfish/domains/domain1/logs/counter_(YYYY-MM-DD).log'
Ownership of logs/ is counter:dataverse
The demo says payara6 but we have version 5. Is that a problem?
I also set the output_file to this
output_file: /usr/local/counter-processor-0.1.04/tmp/make-data-count-report
The tmp/ directory is also owned by counter:dataverse
And to run a test I'm running this
sudo -u counter sh -c "CONFIG_FILE=$COUNTER_PROCESSOR_DIRECTORY/counter-processor-config.yaml YEAR_MONTH=$YEAR_MONTH venv/bin/python3 main.py >> /usr/local/counter-processor-0.1.04/tmp/counter_daily.log"
run counter-processor as counter user
Traceback (most recent call last):
File "main.py", line 51, in <module>
main()
File "main.py", line 29, in main
with open(lf) as infile:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/payara5/glassfish/domains/domain1/logs/counter_(YYYY-MM-DD).log'
The image really didn't help clarify things for me. I guess I need to know: exactly which folders and files to create; exactly which permissions need to be set on the different directories and files; what exactly I need to modify in the config file and what the expected file path for the config file is; whether my location for the make-data-count-report.json file is correct; whether the output_file in the config needs to be an absolute path or can be relative; and what exactly addUsageMetricsFromSushiReport is doing.
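On the earlier question of whether anything was actually produced: one rough way to check is to open the JSON report that Counter Processor writes and count the datasets in it. The sketch below assumes the report follows the SUSHI layout with top-level `report-header` and `report-datasets` keys. That layout and the helper name `summarize_report` are assumptions for illustration, not something confirmed in this thread:

```python
import json


def summarize_report(path):
    """Return a small summary of a SUSHI-style JSON report file."""
    with open(path, encoding="utf-8") as handle:
        report = json.load(handle)
    # Assumed layout: a header object plus a list with one entry per dataset.
    datasets = report.get("report-datasets", [])
    return {
        "has_header": "report-header" in report,
        "dataset_count": len(datasets),
    }
```

For example, `summarize_report("/usr/local/counter-processor-0.1.04/tmp/make-data-count-report.json")` would use the output location mentioned above. A `dataset_count` of 0 would suggest the processor ran but found no usable entries in the logs it read.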
MDC should work fine on Payara 5.
What if you do lowercase like this?
log_name_pattern: /usr/local/payara5/glassfish/domains/domain1/logs/mdc/counter_(yyyy-mm-dd).log
yyyy-mm-dd instead of YYYY-MM-DD, I mean
Basically the same. All of the files in that directory are zero bytes. Not sure if that means anything.
Traceback (most recent call last):
File "main.py", line 51, in <module>
main()
File "main.py", line 29, in main
with open(lf) as infile:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/payara5/glassfish/domains/domain1/logs/mdc/counter_(YYYY-MM-DD).log'
When I switched to the parent directory it's the same error but the files aren't empty.
Traceback (most recent call last):
File "main.py", line 51, in <module>
main()
File "main.py", line 29, in main
with open(lf) as infile:
FileNotFoundError: [Errno 2] No such file or directory: '/usr/local/payara5/glassfish/domains/domain1/logs/counter_(YYYY-MM-DD).log'
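A note on these tracebacks: each one shows the literal `(YYYY-MM-DD)` token in the path, which suggests the placeholder was never replaced with a date. The second traceback still shows the uppercase token after the lowercase suggestion, so the edited config may not be the one being read. The sketch below illustrates the idea of expanding the pattern into one file name per day and listing the ones that do not exist. It assumes the processor substitutes a lowercase `(yyyy-mm-dd)` token and that `YEAR_MONTH` looks like `2023-02`. Both are assumptions, and the helper names are hypothetical rather than Counter Processor's own code:

```python
from datetime import date, timedelta
from pathlib import Path

PLACEHOLDER = "(yyyy-mm-dd)"  # assumed token; not confirmed in this thread


def expand_pattern(pattern, year_month):
    """Return one file name per day of year_month (format 'YYYY-MM')."""
    if PLACEHOLDER not in pattern:
        raise ValueError(f"pattern does not contain {PLACEHOLDER!r}: {pattern}")
    year, month = (int(part) for part in year_month.split("-"))
    day = date(year, month, 1)
    names = []
    while day.month == month:
        names.append(pattern.replace(PLACEHOLDER, day.isoformat()))
        day += timedelta(days=1)
    return names


def missing_files(pattern, year_month):
    """Return the expanded file names that do not exist on disk."""
    return [name for name in expand_pattern(pattern, year_month)
            if not Path(name).is_file()]
```

Calling `missing_files` with the `log_name_pattern` value from the config and the month being processed shows which daily logs the processor would fail to open.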
Bah, it's quittin' time here but please remind me next week.
Just a reminder
Ha, thanks. Standup in 5 minutes and other meetings but I'll try to circle back.
No problem. I have meetings today as well, feel free to take your time and don't feel obligated. Thanks for all of the help
Is it crazy talk to ask for a shell on your server?
lol. It would be super helpful to know all of the folder permissions (which I did kinda get from your Ansible scripts) and where everything needs to be. A simple script to check configurations and/or some automated feedback would have been helpful during this process.
I'm not sure if this helps, but I wrote this script to do everything needed to set this server up. It also shows the counter_daily.sh script I'm using to test this. https://gist.github.com/DonRichards/44b5cd090606fa72a1c7ef5b692f9e47
Right. I remember that gist. I'm just thinking it might be nice if we were on the same server. Would you want to spin one up that we could both ssh to? And use dataverse-ansible to install Counter Processor on?
JHU has so many layers of security it's complicated to get anything outside of the firewall.
Yeah, I was thinking about an EC2 instance on AWS.
Just in case that makes things easier to expose to outside: https://butlerx.github.io/wetty/
I'll talk to my system admin in the morning and see if it's possible with our setup
Thanks. And who else in the community has set MDC up, I wonder. It's been a while for me. :sweat_smile:
I'm trying to spin up an EC2 instance with MDC installed by dataverse-ansible.
Hmm, it's half-configured: http://dev2.dataverse.org
@Don Richards it looks like @luddaniel is setting up MDC, Counter Processor, etc. Please see #community > Make Data Count
Maybe he can help!
Did you ever get it working? :sweat_smile:
That's interesting. Thanks.
Sure. Sorry this has been so difficult.
Last updated: Oct 30 2025 at 06:21 UTC