Background

Derivatives of cannabinoids or marijuana have potential therapeutic applications in treating multiple psychiatric disorders or symptoms [1, 2]. Despite the recent surge of interest in their medical use, the application of these derivatives has been restricted by side effects related to the dosage used [3] and the physical state of users [4]. For the standardized pharmaceutical application of cannabis derivatives, it is important to investigate these side effects comprehensively. Extensive studies have demonstrated that cannabinoids influence emotional and spatial learning/memory, as well as working memory, through changes in the amygdala, hippocampus, and prefrontal cortex at different levels, notably changes in the related synaptic plasticity [5,6,7]. As the role of the striatum in operant learning has been increasingly emphasized, the effects of cannabinoids on striatum-dependent learning have recently attracted growing attention [8, 9].

There are mainly two types of medium spiny neurons (MSNs) in the dorsal striatum: direct pathway MSNs expressing dopamine receptor type 1 (D1R) and indirect pathway MSNs expressing dopamine receptor type 2 (D2R), which have different projection targets and exert distinct functions [10]. Different types of neurons in the dorsal striatum, which integrate diverse excitatory afferents from the cortex and thalamus together with dense innervation from midbrain dopamine neurons, have been suggested to express high levels of cannabinoid type 1 receptors (CB1Rs) [11,12,13] and to exhibit multiple forms of synaptic plasticity that mediate learning [8, 14]. It is often suggested that long-term potentiation (LTP), mediated by N-methyl-d-aspartate receptors (NMDARs), is mainly observed in D1 MSNs, whereas long-term depression (LTD), mediated by CB1Rs and metabotropic glutamate receptors, usually occurs in D2 MSNs [15, 16]. However, various types of endocannabinoid-mediated synaptic plasticity have been observed in both D1 MSNs and D2 MSNs [12, 13, 17]. The endocannabinoid (eCB) system has been widely considered to depress neuronal communication unidirectionally on short or long timescales, but recent reports revealed that eCB-mediated LTP (eCB-LTP), which is regulated by dopamine via D1R and D2R, also plays an important role in learning and memory [18]. The intricate dopamine-endocannabinoid system, together with the direct and indirect pathways, is widely reported to play a role in reinforcement learning.

Reinforcement learning, one of the most important forms of learning mediated by the striatum [10], is commonly used as a behavioral intervention and assessment in different psychiatric disorders [19, 20]. It is a process of maximizing reward (positive reinforcement learning) and evading aversive stimuli (negative reinforcement learning, NRL), which enables individuals to accumulate environmental evidence and optimize behavioral strategies. Reinforcement learning is characterized by an initial response-reinforcer/outcome association phase followed by a phase of habit (stimulus-response) learning [15, 21], and these two phases have been shown to be mainly regulated by two subregions of the dorsal striatum, respectively [21,22,23]. Acquiring the contingency/association between the response and the reinforcer is predominantly mediated by the dorsomedial striatum (DMS) [21], while habit learning/expression, whose defining feature is insensitivity to reinforcer devaluation and contingency degradation, is supported more by the dorsolateral striatum (DLS) [22, 24].

Earlier studies reported that excitatory synaptic changes in the direct pathway mediate the initiation of motion and reward-based learning, while excitatory synaptic changes in the indirect pathway mediate the inhibition of motion and avoidance-related learning [25]. Recent studies suggested that a subtype of D1 MSNs expressing Teashirt family zinc finger 1 (Tshz1) also drives negative reinforcement [

Drugs and acute intraperitoneal injection

HU210 and AM281 were purchased from Sigma (USA) and dissolved in a vehicle of dimethyl sulfoxide (DMSO), Tween-80, and 0.9% saline at a ratio of 2:1:37. All drugs and vehicle were prepared on the day of the experiments. HU210 was acutely administered by intraperitoneal injection 1 h before the behavioral or electrophysiological experiments.
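As an illustration of the vehicle composition, the 2:1:37 ratio can be converted into percentages and per-component volumes. The following is a minimal sketch that assumes the ratio is by volume; the function names and the 4 mL batch size are hypothetical and are not taken from the original protocol.

```python
from fractions import Fraction

# Ratio stated in the text: DMSO : Tween-80 : 0.9% saline = 2 : 1 : 37
# (assumed here to be parts by volume).
VEHICLE_PARTS = {"DMSO": 2, "Tween-80": 1, "saline_0.9pct": 37}


def vehicle_fractions(parts):
    """Convert a parts ratio into exact fractions of the total volume."""
    total = sum(parts.values())
    return {name: Fraction(p, total) for name, p in parts.items()}


def component_volumes_ml(total_ml, parts=VEHICLE_PARTS):
    """Volume (mL) of each component needed to make `total_ml` of vehicle."""
    return {name: float(frac) * total_ml
            for name, frac in vehicle_fractions(parts).items()}


# Percentage of each component in the final vehicle.
percent = {name: float(frac * 100)
           for name, frac in vehicle_fractions(VEHICLE_PARTS).items()}

# Hypothetical batch size of 4 mL, used only for illustration.
batch = component_volumes_ml(4.0)
```

Under this assumption the vehicle contains 5% DMSO, 2.5% Tween-80, and 92.5% saline.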

Tetrodotoxin (TTX) and picrotoxin (PTX) were purchased from Sigma (USA), prepared as concentrated stock solutions, diluted in artificial cerebrospinal fluid (ACSF) to the final concentration on the day of testing, and continuously perfused into the recording chamber. Clozapine N-oxide (CNO; purchased from BrainVTA, China) was pre-dissolved in DMSO and then diluted with 0.9% saline to a final concentration of 0.5% just before the experiments. For drug stocks prepared with DMSO, the final DMSO concentration was less than 0.1%.

Intracranial implantation and microinjections

C57BL/6J mice were anaesthetized with 0.8–1.5% isoflurane and fixed in a stereotaxic apparatus (SR-5; Narishige, Tokyo, Japan). To prevent eye injury, ophthalmic ointment was applied after anesthesia. Guide cannulas (23 gauge with stylets) made of stainless steel tubing were implanted bilaterally into the dorsomedial striatum (AP: +0.6 mm; ML: ±1.5 mm; DV: −2.8 mm). After surgery, mice were allowed to recover for 7 days before the behavioral experiments. Intracranial infusions were delivered through an injection needle (30 gauge) inserted through the guide cannula; the needle was connected to a 0.5-μL syringe by polyethylene tubing and controlled by an automated microinjection pump (World Precision Instruments, Sarasota, FL, USA). AM281 solution (0.3 mg/ml) in a total volume of 1.0 μL/mouse (0.5 μL/side) was injected at a rate of 0.1 μL/min half an hour before the HU210 intraperitoneal injection, or HU210 solution (0.2 mg/ml) in the same volume as AM281 was injected 1 h before the learning experiment. After injection, the needles were left in place for an extra 3 min to allow drug diffusion. At the end of the experiment, mice were sacrificed with an overdose of urethane, and the brains were sliced to verify the cannula placements. Data were excluded if the cannula tip was more than 0.5 mm away from the target.
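The infusion parameters above imply a drug mass per side and an infusion duration, which can be checked with simple arithmetic. This is a minimal sketch; it assumes the 0.1 μL/min rate applies to each side separately, and the function names are hypothetical.

```python
def infused_mass_ug(conc_mg_per_ml, volume_ul):
    """Drug mass in micrograms; 1 mg/mL is numerically equal to 1 ug/uL."""
    return conc_mg_per_ml * volume_ul


def infusion_minutes(volume_ul, rate_ul_per_min):
    """Time needed to deliver `volume_ul` at a constant pump rate."""
    return volume_ul / rate_ul_per_min


VOLUME_PER_SIDE_UL = 0.5   # stated in the text (1.0 uL/mouse, 0.5 uL/side)
RATE_UL_PER_MIN = 0.1      # stated rate; assumed here to be per side

am281_ug_per_side = infused_mass_ug(0.3, VOLUME_PER_SIDE_UL)  # about 0.15 ug
hu210_ug_per_side = infused_mass_ug(0.2, VOLUME_PER_SIDE_UL)  # about 0.10 ug
minutes_per_side = infusion_minutes(VOLUME_PER_SIDE_UL, RATE_UL_PER_MIN)
```

Under these assumptions, each side receives roughly 0.15 μg of AM281 or 0.1 μg of HU210 over about 5 min, followed by the 3-min diffusion period.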

Surgery and virus injections

Drd1-Cre mice were anesthetized with 0.8–1.5% isoflurane and received injections in a stereotaxic apparatus. To prevent eye injury, ophthalmic ointment was applied after anesthesia. For the calcium imaging experiment, rAAV-hSyn-DIO-GCaMP6m-WPRE-pA (BrainVTA; approximate titer 5 × 10¹² vg/ml) virus solution (200 nL) was injected unilaterally into the DMS (AP: +0.6 mm, ML: −1.5 mm, DV: −2.8 mm) using an injection micropipette attached to a nanoinjector at a speed of 30 nL per minute. The nanoinjector was then retained 0.03 mm above the injection site for 10 min to allow virus diffusion, after which the micropipette was slowly withdrawn, and an optical fiber (200 μm core diameter, 0.37 numerical aperture (NA); Shanghai Fiblaser) was implanted and secured to the skull with dental cement.

For the DREADD experiments, chemogenetic activation and inactivation were achieved using hM3Dq DREADD (the excitatory modified Gq-coupled human M3 muscarinic receptor expressed in AAV viral vectors, which can be exclusively activated by CNO) and hM4Di DREADD (the inhibitory modified Gi/o-coupled human M4 muscarinic receptor expressed in AAV viral vectors, which can be exclusively activated by CNO), respectively. For DREADD activation, rAAV-hSyn-DIO-hM3Dq-mCherry virus (mCherry: mCherry fluorescent protein) (BrainVTA; approximate titer 2 × 10¹² vg/ml) was bilaterally injected at the following coordinates in each mouse: AP: +1.18 mm, ML: ±1.2 mm, DV: −3.0 mm; AP: +0.6 mm, ML: ±1.5 mm, DV: −2.8 mm. For DREADD inactivation and control, rAAV-hSyn-DIO-hM4Di-mCherry and rAAV-hSyn-DIO-mCherry viruses were injected, respectively. A total volume of 120 nL was injected at each desired depth at a speed of 30 nL per minute. After every injection, the nanoinjector was retained 0.03 mm above the injection site for 10 min; the micropipette was then withdrawn, and the incisions were closed with sterilized surgical sutures.

After surgery, animals were allowed to recover in their home cages for 3 weeks before behavioral experiments. At the end of the experiments, mice were sacrificed with a urethane overdose, and the brains were sliced to verify the virus infection regions.

Fiber photometry calcium imaging and data analysis

Fiber photometry was used to record the population activity of neurons expressing the genetically encoded calcium indicator in real time. An optical fiber (200-μm core diameter, 0.37 NA; 2 m long) guided the light between the commutator and the implanted optical fiber, and the laser intensity at the tip of the optical fiber was measured and adjusted to 10–20 μW to minimize photobleaching. The signals were collected by a multi-channel fiber photometry recording system (Thinkertech) and digitized and recorded at 50 Hz by ThorCam-DAQ. Recordings were performed in the open field as described above and lasted for 5 min. Before recording, mice were allowed to move freely for 5 min to acclimate.

After recording, data were processed using custom-written MATLAB software. The fluorescence change (dF/F) was estimated as (F(t) − F0)/F0, where F(t) is the fluorescence value at each time point and F0 is the median fluorescence value of the whole session [45]. Only peaks whose amplitudes exceeded a threshold of 2.91 median absolute deviations (MAD) were included in the peak event analysis [46].
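The dF/F calculation and the deviation-based peak criterion can be illustrated with a short sketch. This is not the custom MATLAB code used in the study. It assumes that the deviation measure is the median absolute deviation (MAD), that the threshold is the session median plus 2.91 × MAD, and that a peak is a local maximum above that threshold; the function names and the toy trace are hypothetical.

```python
from statistics import median


def delta_f_over_f(trace):
    """dF/F = (F(t) - F0) / F0, with F0 the median of the whole session."""
    f0 = median(trace)
    return [(f - f0) / f0 for f in trace]


def mad(values):
    """Median absolute deviation around the median."""
    m = median(values)
    return median(abs(v - m) for v in values)


def peak_events(dff, n_mad=2.91):
    """Indices of local maxima above median + n_mad * MAD.

    The form of the threshold is an assumption made for illustration.
    """
    threshold = median(dff) + n_mad * mad(dff)
    return [
        i for i in range(1, len(dff) - 1)
        if dff[i] > threshold and dff[i] > dff[i - 1] and dff[i] >= dff[i + 1]
    ]


# Toy fluorescence trace (arbitrary units) with two clear transients.
raw = [100, 101, 99, 100, 150, 100, 102, 98, 100, 100, 140, 101, 100]
dff = delta_f_over_f(raw)
peaks = peak_events(dff)  # indices of the two transients: [4, 10]
```

In this toy trace, the small fluctuations of 1 to 2% stay below the threshold, and only the two large transients are counted as events.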

Electrophysiological recordings

One hour after the HU210 or vehicle injection, mice were anaesthetized with isoflurane and decapitated. The forebrain was separated from the cerebellum by a coronal cut, divided into two parts by a sagittal cut along the longitudinal fissure, and immediately glued to a cutting stage immersed in oxygenated (95% O2 and 5% CO2) ACSF at physiological temperature (~34 °C) containing (in mM): 125 NaCl, 2.5 KCl, 25 glucose, 25 NaHCO3, 1.25 NaH2PO4, 2 CaCl2, and 1 MgCl2 (pH 7.2–7.4). Sagittal slices (300 μm) through the striatum were quickly cut with a vibratome (VT 1200S, Leica, Germany) at a speed of 0.06 mm/s. Slices were then incubated in oxygenated ACSF for 30 min at 32–34 °C and allowed to recover for 1 h. After recovery, a slice was transferred to a recording chamber and perfused with continuously oxygenated ACSF at 31 ± 1 °C. The recording pipettes had a resistance of 3–5 MΩ when filled with the following RNase-free solution (in mM): 140 Cs-methanesulfonate, 2 MgCl2, 0.2 EGTA, 10 HEPES, 4 Mg-ATP, 0.3 Na2-GTP, 2 QX-314, and 10 Na2-phosphocreatine (pH 7.2–7.4 with CsOH). MSNs in the dorsal striatum were visualized using an upright microscope (DM LFSA, Leica, Germany). Neurons were preliminarily selected and identified by their morphological features (flat appearance, medium size, large initial axon segment), and whole-cell recordings were carried out with a Multiclamp 700B amplifier (Molecular Devices, Sunnyvale, CA, USA). Miniature excitatory postsynaptic currents (mEPSCs) were recorded at a holding potential of −80 mV in ACSF superfusate containing 100 μM PTX and 1 μM TTX. Continuous recordings of mEPSCs lasting at least 5 min were filtered at 2 kHz and digitized at 10 kHz using a Multiclamp 700B amplifier and a Digidata 1550 (Molecular Devices, USA). Recorded mEPSCs were analyzed using Clampfit (Molecular Devices, Sunnyvale, CA, USA) and detected with a template-matching algorithm (threshold: amplitude > 4 pA; baseline: −80 mV). Intrinsic properties were examined by current-clamp recording. Currents were injected in steps of 25 pA, ranging from −200 to 300 pA (1000 ms duration each), and neurons were allowed to recover for 5 min before recording.
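The current-step protocol described above can be enumerated programmatically, for example when pairing each injected current with the number of evoked action potentials. The sketch below is illustrative only: the spike-counting rule (upward crossings of a 0 mV threshold) and all names are assumptions and do not describe the analysis actually performed.

```python
def current_steps(start_pa=-200, stop_pa=300, step_pa=25):
    """Injected currents (pA) from start to stop, inclusive, in equal steps."""
    return list(range(start_pa, stop_pa + step_pa, step_pa))


def count_spikes(voltage_mv, threshold_mv=0.0):
    """Count upward threshold crossings in a sampled voltage trace.

    The 0 mV threshold is a hypothetical choice for illustration.
    """
    return sum(
        1 for prev, cur in zip(voltage_mv, voltage_mv[1:])
        if prev < threshold_mv <= cur
    )


steps = current_steps()        # 21 steps: -200, -175, ..., 300 pA
STEP_DURATION_S = 1.0          # 1000 ms per step, as stated in the text
total_injection_s = len(steps) * STEP_DURATION_S

# Toy voltage trace (mV) containing two spikes.
toy_trace = [-70, -60, 10, -50, -70, 20, 30, -60]
n_spikes = count_spikes(toy_trace)
```

With 25-pA increments, the protocol comprises 21 steps, so one full sweep requires 21 s of current injection excluding inter-step intervals.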

To functionally validate the expressed hM3Dq and hM4Di proteins, current-clamp recording was used to measure evoked action potentials in the CNO activation or inhibition experiments. D1 MSNs expressing hM3Dq or hM4Di were visually identified by mCherry fluorescence. The current-clamp recording protocol was the same as described above. Neurons were allowed to recover before the brain slices were perfused with ACSF containing 10 μM CNO, and the same current-clamp procedure was performed 10 min after CNO perfusion.

Single-cell RT-PCR

To identify the type of recorded MSNs, single-cell RT-PCR was performed. After electrophysiological recording, a small negative pressure was applied to the patch pipette to keep the neuron attached. The pipette was gently withdrawn to pull the neuron off the slice. Then, the electrode with the attached neuron was put into a microcentrifuge tube containing 3 μL ddH2O and 0.5 μL of 40 U/μL RNasin (Promega, USA); subsequently, a reverse transcription-polymerase chain reaction (RT-PCR) assay was performed to obtain its cDNA. The resulting cDNA fragments were amplified twice by nested PCR and subjected to 2% agarose gel electrophoresis. Single-cell reverse transcription and PCR amplification were based on a patented method (see the supplementary methods and tables) [42]. All PCR reagents were from Takara (Japan), and a representative image of the agarose gel electrophoresis is shown in Additional file 1: Fig. S1.

Single-strand cDNA was synthesized in PCR tubes containing 2 μL mixed dNTPs (2.5 mmol/L each), 0.5 μL oligo (dT) primer (50 μmol/L), and 0.5 μL random primer (100 μmol/L) (Takara, Japan). The mixture was heated to 65 °C for 5 min and then chilled on ice for 1 min. After chilling, 2.5 μL 5 × RT Buffer and 0.75 μL Maxima Reverse Transcriptase (200 U/μL; Thermo Scientific, USA) were added, and the mixture was held at 25 °C for 10 min, 50 °C for 30 min, and 85 °C for 5 min, and finally kept at 4 °C. A multiplex single-cell nested PCR was carried out to identify the dopamine receptor type of MSNs (Drd1 for D1, Drd2 for D2, GAD67 for GABA). Primers and amplicons are shown in Additional file 2 (Table S1), and the PCR reaction conditions are shown in Additional file 2 (Table S2). The second-round PCR products were identified by 2% agarose gel electrophoresis.

Histology

After calcium imaging and DREADD experiments, mice were deeply anesthetized with sodium pentobarbital (50 mg in a volume of 0.1 ml) and transcardially perfused with 5–10 ml 0.9% saline, followed by 10–15 ml chilled 4% paraformaldehyde. Brains were removed and post-fixed at 4 °C for 24 h, followed by dehydration in a solution of 30% sucrose in 0.1 M PBS for at least 2 days. Forty-micrometer coronal slices were prepared on a freezing microtome (CM1950, Leica Microsciences, Germany), and fluorescence images were taken with a laser scanning microscope (Leica, TCS SP5, Germany).

Statistical analysis

Repeated-measures ANOVA (with trial or current as the repeated factor), unpaired t-test, and Mann-Whitney test were used to determine differences between groups. Post hoc testing was conducted using Holm-Sidak’s test for multiple comparisons. Differences with p < 0.05 were considered statistically significant. All statistical analyses were conducted using SPSS version 20 (IBM Corp., Armonk, NY).
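For readers unfamiliar with the Holm-Sidak correction, the step-down adjustment of p values can be sketched as follows. This is a generic illustration of the procedure and not the SPSS implementation used for the analyses; the function name and the example p values are hypothetical.

```python
def holm_sidak(pvalues):
    """Holm-Sidak step-down adjusted p values, returned in the input order.

    The k-th smallest p value (k = 1..m) is adjusted to
    1 - (1 - p)^(m - k + 1), and adjusted values are forced to be
    non-decreasing along the sorted order.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        adj = 1.0 - (1.0 - pvalues[idx]) ** (m - rank)
        running_max = max(running_max, adj)
        adjusted[idx] = min(running_max, 1.0)
    return adjusted


# Hypothetical raw p values from three pairwise comparisons.
raw_p = [0.01, 0.04, 0.03]
adj_p = holm_sidak(raw_p)
significant = [p < 0.05 for p in adj_p]
```

In this hypothetical example, only the first comparison remains significant at p < 0.05 after adjustment (adjusted p of about 0.030), whereas the other two are adjusted to about 0.059.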

Availability of data and materials

All data generated or analyzed during this study are included in this published article and its additional files.